From 60c2c643b05bad2cf9ef9f39b29a2bb82e759db3 Mon Sep 17 00:00:00 2001 From: Dmitrii Kuvaiskii Date: Tue, 21 Jun 2022 10:18:48 -0700 Subject: [PATCH] [LibOS,Pal] Add flexible device-specific IOCTL support The newly added `ioctl()` syscall emulation on device-backed file descriptors is pass-through. It is insecure by itself since the emulation only passes the arguments to and from the untrusted memory. It is the responsibility of the app developer to correctly use ioctls, with security implications in mind. On the Linux-SGX PAL, a set of IOCTL requests must be explicitly allowed in the manifest via the new option `sgx.allowed_ioctls.[id].request`. Also, the allowed IOCTLs' arguments (typically pointers to complex nested objects) must be explicitly described in the manifest via the new options `sgx.ioctl_structs.[id]` and a corresponding reference `sgx.allowed_ioctls.[id].struct`; see docs for explanation of the IOCTL struct format. This commit adds a new LibOS test `device_ioctl` that tests the flexible IOCTL logic against Gramine dummy device `/dev/gramine_test_dev`. This device is found in companion repo `gramineproject/device-testing-tools`. 
Signed-off-by: Dmitrii Kuvaiskii --- Documentation/manifest-syntax.rst | 94 +++ libos/src/sys/libos_ioctl.c | 24 +- libos/test/regression/device_ioctl.c | 200 +++++ .../regression/device_ioctl.manifest.template | 72 ++ libos/test/regression/meson.build | 1 + libos/test/regression/test_libos.py | 4 + libos/test/regression/tests.toml | 1 + libos/test/regression/tests_musl.toml | 1 + pal/include/pal/pal.h | 19 + pal/include/pal_internal.h | 1 + pal/src/host/linux-sgx/enclave_ocalls.c | 28 + pal/src/host/linux-sgx/enclave_ocalls.h | 2 + pal/src/host/linux-sgx/host_ocalls.c | 7 + pal/src/host/linux-sgx/pal_devices.c | 728 ++++++++++++++++++ pal/src/host/linux-sgx/pal_ocall_types.h | 7 + pal/src/host/linux-sgx/pal_tls.h | 1 + pal/src/host/linux/pal_devices.c | 15 + pal/src/host/skeleton/pal_devices.c | 4 + pal/src/pal_misc.c | 4 + pal/src/pal_symbols | 1 + 20 files changed, 1212 insertions(+), 2 deletions(-) create mode 100644 libos/test/regression/device_ioctl.c create mode 100644 libos/test/regression/device_ioctl.manifest.template diff --git a/Documentation/manifest-syntax.rst b/Documentation/manifest-syntax.rst index bb3dc6cd1c..b444fd3252 100644 --- a/Documentation/manifest-syntax.rst +++ b/Documentation/manifest-syntax.rst @@ -717,6 +717,100 @@ they were listed as ``allowed_files``. (However, this policy still does not allow writing/creating files specified as trusted.) This policy is a convenient way to determine the set of files that the ported application uses. +Allowed IOCTLs +^^^^^^^^^^^^^^ + +:: + + sgx.ioctl_structs.[identifier] = [memory-layout-format] + + sgx.allowed_ioctls.[identifier].request = [NUM] + sgx.allowed_ioctls.[identifier].struct = "[identifier-of-ioctl-struct]" + +By default, Gramine with SGX disables all device-backed IOCTLs. This syntax +explicitly allows a set of IOCTLs on devices (devices must be +explicitly mounted via the ``fs.mounts`` manifest syntax). 
Only IOCTLs with the +``request`` argument found among the manifest-listed IOCTLs are allowed to +pass through to the host. Each IOCTL entry must also contain a reference to an +IOCTL struct in its ``struct`` field. + +Available IOCTL structs are described via ``sgx.ioctl_structs``. Each IOCTL +struct describes the memory layout of the ``arg`` argument (typically a pointer +to a complex nested object passed to the device). Description of the memory +layout is required for a deep copy of the argument. The memory layout is +described using the TOML syntax of inline arrays (for each new separate memory +region) and inline tables (for each sub-region in one memory region). Each +sub-region is described via the following keys: + +- ``name`` is an optional name for this sub-region; mainly used to find + length-specifying fields and nested memory regions. +- ``align`` is an optional alignment of the memory region; may be specified only + in the first sub-region of a memory region (all other sub-regions are + contiguous with the first sub-region, so specifying their alignment doesn't + make sense). +- ``size`` is a mandatory size of this sub-region. The ``size`` field may be a + string with the name of another field that contains the size value or an + integer with the constant size measured in ``units`` (default unit is 1 byte; + also see below). For example, ``size = "strlen"`` denotes a size field that + will be calculated dynamically during IOCTL execution based on the sub-region + named ``strlen``, whereas ``size = 16`` denotes a sub-region of size 16B. Note + that for ``ptr`` sub-regions, the ``size`` field has a different meaning: it + denotes the number of adjacent memory regions (in other words, it denotes the + number of items in the ``ptr`` array). +- ``unit`` is an optional unit of measurement for ``size``. It is 1 byte by + default. Unit of measurement must be a constant integer. 
For example, + ``size = "strlen"`` and ``unit = 2`` denote a wide-char string (where each + character is 2B long) of a dynamically calculated length. +- ``adjust`` is an optional integer adjustment for ``size``. It is 0 bytes by + default. This field must be a constant (possibly negative) integer. For + example, ``adjust = -8`` and ``size = 12`` result in a total size of 4B. +- ``type = ["none" | "out" | "in" | "inout"]`` is an optional direction of copy + for this sub-region. For example, ``type = "out"`` denotes a sub-region to be + copied out of the enclave to untrusted memory, i.e., this sub-region is an + input to the host device. The default value is ``none``, which is useful for + e.g. padding of structs. This field may be omitted if the ``ptr`` field is + specified for this sub-region (pointer sub-regions contain the pointer value + which will be unconditionally rewired to point to untrusted memory). +- ``ptr = [ another memory region ]`` or ``ptr = "another-memory-region"`` + specifies a pointer to another, nested memory region. This field is required + when describing complex IOCTL structs. Such a pointer sub-region always has + an implicit size of 8B, and the pointer value is always rewired to the memory + region in untrusted memory (containing a copied-out nested memory region). If + ``ptr`` is specified together with ``size``, it describes not just a pointer + but an array of these memory regions. A special keyword ``ptr = "this"`` + specifies a pointer to the memory region of the IOCTL struct's root memory + layout. + +Consider this simple example:: + + sgx.ioctl_structs.st1 = [ { ptr=[ {name="nested_region", align=4096, size=4096, type="out"} ] } ] + +The above example specifies a root struct (first memory region) that consists +of a single sub-region that contains an 8-byte pointer value. This pointer +points to another memory region in enclave memory that contains a single +sub-region of size 4KB and that must be 4KB-aligned. 
This nested sub-region has +a name ``nested_region`` (not used, only for illustrative purposes). Also, this +nested sub-region is copied out of the enclave. The pointer value of the first +memory region is rewired to point to the copied-out second memory region in +untrusted memory. No fields/memory regions are copied back from untrusted memory +inside the enclave after an IOCTL with this struct executes. + +If the IOCTL's third argument is simply an integer (or unused at all), then the +syntax must specify the struct as an empty TOML array:: + + sgx.ioctl_structs.st2 = [ ] + +IOCTLs that use these structs are defined like this:: + + sgx.allowed_ioctls.io1.request = 0x12345678 + sgx.allowed_ioctls.io1.struct = "st1" + + sgx.allowed_ioctls.io2.request = 0x87654321 + sgx.allowed_ioctls.io2.struct = "st1" + + sgx.allowed_ioctls.io3.request = 0x43218765 # this IOCTL's arg is passed as-is + sgx.allowed_ioctls.io3.struct = "st2" + Attestation and quotes ^^^^^^^^^^^^^^^^^^^^^^ diff --git a/libos/src/sys/libos_ioctl.c b/libos/src/sys/libos_ioctl.c index 8c26a008b2..b7306c96c7 100644 --- a/libos/src/sys/libos_ioctl.c +++ b/libos/src/sys/libos_ioctl.c @@ -13,6 +13,7 @@ #include "libos_signal.h" #include "libos_table.h" #include "pal.h" +#include "stat.h" static void signal_io(IDTYPE caller, void* arg) { __UNUSED(caller); @@ -116,9 +117,28 @@ long libos_syscall_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg) { ret = 0; break; } - default: - ret = -ENOSYS; + default: { + lock(&g_dcache_lock); + bool is_host_dev = hdl->type == TYPE_CHROOT && hdl->dentry->inode && + hdl->dentry->inode->type == S_IFCHR; + unlock(&g_dcache_lock); + + if (!is_host_dev) { + ret = -ENOSYS; + break; + } + + int cmd_ret; + ret = PalDeviceIoControl(hdl->pal_handle, cmd, arg, &cmd_ret); + if (ret < 0) { + ret = pal_to_unix_errno(ret); + break; + } + + assert(ret == 0); + ret = cmd_ret; break; + } } put_handle(hdl); diff --git a/libos/test/regression/device_ioctl.c 
b/libos/test/regression/device_ioctl.c new file mode 100644 index 0000000000..416faf7078 --- /dev/null +++ b/libos/test/regression/device_ioctl.c @@ -0,0 +1,200 @@ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "rw_file.h" + +#define STRING_READWRITE "Hello world via read/write\n" +#define STRING_IOCTL "Hello world via ioctls\n" +#define STRING_IOCTL_REPLACED "He$$0 w0r$d via i0ct$s\n" + +struct gramine_test_dev_ioctl_write { + size_t buf_size; /* in */ + const char* buf; /* in */ + ssize_t off; /* in/out -- updated after write */ + ssize_t copied; /* out -- how many bytes were actually written */ +}; + +struct gramine_test_dev_ioctl_read { + size_t buf_size; /* in */ + char* buf; /* out */ + ssize_t off; /* in/out -- updated after read */ + ssize_t copied; /* out -- how many bytes were actually read */ +}; + +struct gramine_test_dev_ioctl_replace_char { + char src; /* in */ + char dst; /* in */ + char pad[6]; +}; + +struct gramine_test_dev_ioctl_replace_arr { + /* array of replacements, e.g. replacements_cnt == 2 and [`l` -> `$`, `o` -> `0`] */ + size_t replacements_cnt; + struct gramine_test_dev_ioctl_replace_char* replacements_arr; +}; + +struct gramine_test_dev_ioctl_replace_list { + /* list of replacements, e.g. 
[`l` -> `$`, next points to `o` -> `0`, next points to NULL] */ + struct gramine_test_dev_ioctl_replace_char replacement; + struct gramine_test_dev_ioctl_replace_list* next; +}; + +#define GRAMINE_TEST_DEV_IOCTL_BASE 0x33 + +#define GRAMINE_TEST_DEV_IOCTL_REWIND _IO(GRAMINE_TEST_DEV_IOCTL_BASE, 0x00) +#define GRAMINE_TEST_DEV_IOCTL_WRITE _IOWR(GRAMINE_TEST_DEV_IOCTL_BASE, 0x01, \ + struct gramine_test_dev_ioctl_write) +#define GRAMINE_TEST_DEV_IOCTL_READ _IOWR(GRAMINE_TEST_DEV_IOCTL_BASE, 0x02, \ + struct gramine_test_dev_ioctl_read) +#define GRAMINE_TEST_DEV_IOCTL_GETSIZE _IO(GRAMINE_TEST_DEV_IOCTL_BASE, 0x03) +#define GRAMINE_TEST_DEV_IOCTL_CLEAR _IO(GRAMINE_TEST_DEV_IOCTL_BASE, 0x04) +#define GRAMINE_TEST_DEV_IOCTL_REPLACE_ARR _IOW(GRAMINE_TEST_DEV_IOCTL_BASE, 0x05, \ + struct gramine_test_dev_ioctl_replace_arr) +#define GRAMINE_TEST_DEV_IOCTL_REPLACE_LIST _IOW(GRAMINE_TEST_DEV_IOCTL_BASE, 0x06, \ + struct gramine_test_dev_ioctl_replace_list) + +int main(int argc, char* argv[]) { + int ret; + ssize_t bytes; + char buf[64]; + + int devfd = open("/dev/gramine_test_dev", O_RDWR); + if (devfd < 0) + err(1, "/dev/gramine_test_dev open"); + + /* test 1 -- use write() and read() syscalls */ + bytes = posix_fd_write(devfd, STRING_READWRITE, sizeof(STRING_READWRITE)); + if (bytes < 0) + return EXIT_FAILURE; + + /* lseek() doesn't work in Gramine because it is fully emulated in LibOS and therefore lseek() + * is not aware of device-specific semantics; instead we use a device-specific ioctl() */ + off_t offset = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_REWIND); + if (offset < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_REWIND)"); + if (offset > 0) + errx(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_REWIND) didn't return 0 " + "(returned: %ld)", offset); + + memset(&buf, 0, sizeof(buf)); + bytes = posix_fd_read(devfd, buf, sizeof(buf) - 1); + if (bytes < 0) + return EXIT_FAILURE; + + if (strcmp(buf, STRING_READWRITE)) + errx(1, "read `%s` from 
/dev/gramine_test_dev but expected `%s`", buf, STRING_READWRITE); + + ssize_t devfd_size = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_GETSIZE); + if (devfd_size < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_GETSIZE)"); + if (devfd_size != sizeof(STRING_READWRITE)) + errx(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_GETSIZE) didn't return %lu " + "(returned: %ld)", sizeof(STRING_READWRITE), devfd_size); + + /* test 2 -- use ioctl(GRAMINE_TEST_DEV_IOCTL_WRITE) and ioctl(GRAMINE_TEST_DEV_IOCTL_READ) + * syscalls */ + ret = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_CLEAR); + if (ret < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_CLEAR)"); + + struct gramine_test_dev_ioctl_write write_arg = { + .buf_size = sizeof(STRING_IOCTL), + .buf = STRING_IOCTL, + .off = 0 + }; + ret = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_WRITE, &write_arg); + if (ret < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_WRITE)"); + if (write_arg.off != sizeof(STRING_IOCTL)) + errx(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_WRITE) didn't update offset " + "to %lu (returned: %ld)", sizeof(STRING_IOCTL), write_arg.off); + if (write_arg.copied != sizeof(STRING_IOCTL)) + errx(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_WRITE) didn't copy %lu bytes " + "(returned: %ld)", sizeof(STRING_IOCTL), write_arg.copied); + + memset(buf, 0, sizeof(buf)); + struct gramine_test_dev_ioctl_read read_arg = { + .buf_size = sizeof(buf) - 1, + .buf = buf, + .off = 0 + }; + ret = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_READ, &read_arg); + if (ret < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_READ)"); + if (read_arg.off != sizeof(STRING_IOCTL)) + errx(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_READ) didn't update offset " + "to %lu (returned: %ld)", sizeof(STRING_IOCTL), read_arg.off); + if (read_arg.copied != sizeof(STRING_IOCTL)) + errx(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_READ) didn't copy %lu bytes " + 
"(returned: %ld)", sizeof(STRING_IOCTL), read_arg.copied); + + if (strcmp(buf, STRING_IOCTL)) + errx(1, "read `%s` from /dev/gramine_test_dev but expected `%s`", buf, STRING_IOCTL); + + devfd_size = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_GETSIZE); + if (devfd_size < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_GETSIZE)"); + if (devfd_size != sizeof(STRING_IOCTL)) + errx(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_GETSIZE) didn't return %lu " + "(returned: %ld)", sizeof(STRING_IOCTL), devfd_size); + + /* test 3 -- use complex ioctl(GRAMINE_TEST_DEV_IOCTL_REPLACE_ARR) syscall */ + struct gramine_test_dev_ioctl_replace_char replace_chars[] = { + { .src = 'l', .dst = '$' }, + { .src = 'o', .dst = '0' } + }; + struct gramine_test_dev_ioctl_replace_arr replace_arr = { + .replacements_cnt = 2, + .replacements_arr = replace_chars + }; + ret = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_REPLACE_ARR, &replace_arr); + if (ret < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_REPLACE_ARR)"); + + memset(buf, 0, sizeof(buf)); + read_arg.off = 0; + ret = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_READ, &read_arg); + if (ret < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_READ)"); + if (strcmp(buf, STRING_IOCTL_REPLACED)) + errx(1, "read `%s` from /dev/gramine_test_dev but expected `%s`", buf, + STRING_IOCTL_REPLACED); + + /* test 4 -- use complex ioctl(GRAMINE_TEST_DEV_IOCTL_REPLACE_LIST) syscall */ + struct gramine_test_dev_ioctl_replace_list replace_list_2 = { + .replacement = { .src = '0', .dst = 'o' }, + .next = NULL + }; + struct gramine_test_dev_ioctl_replace_list replace_list = { + .replacement = { .src = '$', .dst = 'l' }, + .next = &replace_list_2 + }; + + ret = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_REPLACE_LIST, &replace_list); + if (ret < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_REPLACE_LIST)"); + + memset(buf, 0, sizeof(buf)); + read_arg.off = 0; + ret = ioctl(devfd, GRAMINE_TEST_DEV_IOCTL_READ, 
&read_arg); + if (ret < 0) + err(1, "/dev/gramine_test_dev ioctl(GRAMINE_TEST_DEV_IOCTL_READ)"); + if (strcmp(buf, STRING_IOCTL)) + errx(1, "read `%s` from /dev/gramine_test_dev but expected `%s`", buf, STRING_IOCTL); + + ret = close(devfd); + if (ret < 0) + err(1, "/dev/gramine_test_dev close"); + + puts("TEST OK"); + return 0; +} diff --git a/libos/test/regression/device_ioctl.manifest.template b/libos/test/regression/device_ioctl.manifest.template new file mode 100644 index 0000000000..172c809947 --- /dev/null +++ b/libos/test/regression/device_ioctl.manifest.template @@ -0,0 +1,72 @@ +loader.entrypoint = "file:{{ gramine.libos }}" +libos.entrypoint = "{{ entrypoint }}" + +loader.argv0_override = "{{ entrypoint }}" +loader.env.LD_LIBRARY_PATH = "/lib" + +fs.mounts = [ + { path = "/lib", uri = "file:{{ gramine.runtimedir(libc) }}" }, + { path = "/{{ entrypoint }}", uri = "file:{{ binary_dir }}/{{ entrypoint }}" }, + { path = "/dev/gramine_test_dev", uri = "dev:/dev/gramine_test_dev" }, +] + +sgx.nonpie_binary = true +sgx.debug = true + +sgx.trusted_files = [ + "file:{{ gramine.libos }}", + "file:{{ gramine.runtimedir(libc) }}/", + "file:{{ binary_dir }}/{{ entrypoint }}", +] + +# for IOCTLs without an argument (or with integer argument) +sgx.ioctl_structs.gramine_test_dev_ioctl_dummy = [ ] + +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_REWIND.request = 0x3300 +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_REWIND.struct = "gramine_test_dev_ioctl_dummy" + +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_GETSIZE.request = 0x3303 +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_GETSIZE.struct = "gramine_test_dev_ioctl_dummy" + +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_CLEAR.request = 0x3304 +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_CLEAR.struct = "gramine_test_dev_ioctl_dummy" + +sgx.ioctl_structs.gramine_test_dev_ioctl_write = [ + { size=8, type="out", name="buf_size" }, # buf_size + { ptr=[ {size="buf_size", type="out"} ] }, # buf + { size=8, type="inout" }, # off + { adjust=-4, 
size=12, type="in" }, # copied; adjust is just for testing +] + +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_WRITE.request = 0xc0203301 +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_WRITE.struct = "gramine_test_dev_ioctl_write" + +sgx.ioctl_structs.gramine_test_dev_ioctl_read = [ + { size=8, type="out", name="buf_size" }, # buf_size + { ptr=[ {size="buf_size", type="in"} ] }, # buf + { size=8, type="inout" }, # off + { adjust=-4, size=12, type="in" }, # copied; adjust is just for testing +] + +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_READ.request = 0xc0203302 +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_READ.struct = "gramine_test_dev_ioctl_read" + +sgx.ioctl_structs.gramine_test_dev_ioctl_replace_arr = [ + { size=8, type="out", name="replacements_cnt" }, # replacements_cnt + { size="replacements_cnt", ptr=[ # replacements_arr + { size=2, units=1, type="out" }, # src, dst + { size=6, units=1, type="none" }, # pad + ] }, +] + +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_REPLACE_ARR.request = 0x40103305 +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_REPLACE_ARR.struct = "gramine_test_dev_ioctl_replace_arr" + +sgx.ioctl_structs.gramine_test_dev_ioctl_replace_list = [ + { size=2, units=1, type="out" }, # src, dst + { size=6, units=1, type="none" }, # pad + { ptr="this" }, # next +] + +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_REPLACE_LIST.request = 0x40103306 +sgx.allowed_ioctls.GRAMINE_TEST_DEV_IOCTL_REPLACE_LIST.struct = "gramine_test_dev_ioctl_replace_list" diff --git a/libos/test/regression/meson.build b/libos/test/regression/meson.build index aefba09839..7be78ff4ac 100644 --- a/libos/test/regression/meson.build +++ b/libos/test/regression/meson.build @@ -13,6 +13,7 @@ tests = { }, 'devfs': {}, 'device_passthrough': {}, + 'device_ioctl': {}, 'double_fork': {}, 'epoll_epollet': {}, 'epoll_test': {}, diff --git a/libos/test/regression/test_libos.py b/libos/test/regression/test_libos.py index b9d6c4e357..6df56d1880 100644 --- a/libos/test/regression/test_libos.py +++ 
b/libos/test/regression/test_libos.py @@ -947,6 +947,10 @@ def test_002_device_passthrough(self): stdout, _ = self.run_binary(['device_passthrough']) self.assertIn('TEST OK', stdout) + def test_003_device_ioctl(self): + stdout, _ = self.run_binary(['device_ioctl']) + self.assertIn('TEST OK', stdout) + def test_010_path(self): stdout, _ = self.run_binary(['proc_path']) self.assertIn('proc path test success', stdout) diff --git a/libos/test/regression/tests.toml b/libos/test/regression/tests.toml index 10a8e13291..f712e10c43 100644 --- a/libos/test/regression/tests.toml +++ b/libos/test/regression/tests.toml @@ -12,6 +12,7 @@ manifests = [ "debug_log_inline", "devfs", "device_passthrough", + "device_ioctl", "double_fork", "env_from_file", "env_from_host", diff --git a/libos/test/regression/tests_musl.toml b/libos/test/regression/tests_musl.toml index ddd6389731..706f2582be 100644 --- a/libos/test/regression/tests_musl.toml +++ b/libos/test/regression/tests_musl.toml @@ -14,6 +14,7 @@ manifests = [ "debug_log_inline", "devfs", "device_passthrough", + "device_ioctl", "double_fork", "env_from_file", "env_from_host", diff --git a/pal/include/pal/pal.h b/pal/include/pal/pal.h index c0d42462ed..3178cd40e1 100644 --- a/pal/include/pal/pal.h +++ b/pal/include/pal/pal.h @@ -842,6 +842,25 @@ int PalSegmentBaseSet(enum pal_segment_reg reg, uintptr_t addr); */ size_t PalMemoryAvailableQuota(void); +/*! + * \brief Perform a device-specific operation `cmd`. + * + * \param handle Handle of the device. + * \param cmd Device-dependent request/control code. + * \param[in,out] arg Arbitrary argument to `cmd`. May be unused or used as a 64-bit integer + * or used as a pointer to a buffer that contains the data required to + * perform the operation as well as the data returned by the operation. For + * some PALs (e.g., Linux-SGX PAL), the manifest must describe the layout of + * this buffer in order to correctly copy the data to/from the host. 
+ * \param[out] out_ret Typically zero, but some device-specific operations return a + * device-specific nonnegative value (in addition to or instead of \p arg). + * + * \returns 0 on success, negative error value on failure. + * + * This function corresponds to ioctl() in UNIX systems and DeviceIoControl() in Windows. + */ +int PalDeviceIoControl(PAL_HANDLE handle, uint32_t cmd, uint64_t arg, int* out_ret); + /*! * \brief Obtain the attestation report (local) with `user_report_data` embedded into it. * diff --git a/pal/include/pal_internal.h b/pal/include/pal_internal.h index 88314a697b..69acc410f4 100644 --- a/pal/include/pal_internal.h +++ b/pal/include/pal_internal.h @@ -176,6 +176,7 @@ void _PalGetAvailableUserAddressRange(void** out_start, void** out_end); bool _PalCheckMemoryMappable(const void* addr, size_t size); unsigned long _PalMemoryQuota(void); unsigned long _PalMemoryAvailableQuota(void); +int _PalDeviceIoControl(PAL_HANDLE handle, uint32_t cmd, uint64_t arg, int* out_ret); // Returns 0 on success, negative PAL code on failure int _PalGetCPUInfo(struct pal_cpu_info* info); diff --git a/pal/src/host/linux-sgx/enclave_ocalls.c b/pal/src/host/linux-sgx/enclave_ocalls.c index 543fef23f0..47f64bd32f 100644 --- a/pal/src/host/linux-sgx/enclave_ocalls.c +++ b/pal/src/host/linux-sgx/enclave_ocalls.c @@ -1980,6 +1980,34 @@ int ocall_eventfd(int flags) { return retval; } +int ocall_ioctl(int fd, unsigned int cmd, unsigned long arg) { + int retval; + ms_ocall_ioctl_t* ms; + + void* old_ustack = sgx_prepare_ustack(); + ms = sgx_alloc_on_ustack_aligned(sizeof(*ms), alignof(*ms)); + if (!ms) { + sgx_reset_ustack(old_ustack); + return -EPERM; + } + + WRITE_ONCE(ms->ms_fd, fd); + WRITE_ONCE(ms->ms_cmd, cmd); + WRITE_ONCE(ms->ms_arg, arg); + + do { + retval = sgx_exitless_ocall(OCALL_IOCTL, ms); + } while (retval == -EINTR); + + if (retval < 0 && retval != -EBADF && retval != -EFAULT && retval != -EINVAL && + retval != -ENOTTY) { + retval = -EPERM; + } + + 
sgx_reset_ustack(old_ustack); + return retval; +} + int ocall_get_quote(const sgx_spid_t* spid, bool linkable, const sgx_report_t* report, const sgx_quote_nonce_t* nonce, char** quote, size_t* quote_len) { int retval; diff --git a/pal/src/host/linux-sgx/enclave_ocalls.h b/pal/src/host/linux-sgx/enclave_ocalls.h index c5e6a1385e..dff5a61e78 100644 --- a/pal/src/host/linux-sgx/enclave_ocalls.h +++ b/pal/src/host/linux-sgx/enclave_ocalls.h @@ -105,6 +105,8 @@ int ocall_debug_describe_location(uintptr_t addr, char* buf, size_t buf_size); int ocall_eventfd(int flags); +int ocall_ioctl(int fd, unsigned int cmd, unsigned long arg); + /*! * \brief Execute untrusted code in PAL to obtain a quote from the Quoting Enclave. * diff --git a/pal/src/host/linux-sgx/host_ocalls.c b/pal/src/host/linux-sgx/host_ocalls.c index d62d1df193..ad88c99a4f 100644 --- a/pal/src/host/linux-sgx/host_ocalls.c +++ b/pal/src/host/linux-sgx/host_ocalls.c @@ -734,6 +734,12 @@ static long sgx_ocall_debug_describe_location(void* pms) { #endif } +static long sgx_ocall_ioctl(void* pms) { + ms_ocall_ioctl_t* ms = (ms_ocall_ioctl_t*)pms; + long ret = DO_SYSCALL(ioctl, ms->ms_fd, ms->ms_cmd, ms->ms_arg); + return ret; +} + static long sgx_ocall_get_quote(void* pms) { ms_ocall_get_quote_t* ms = (ms_ocall_get_quote_t*)pms; return retrieve_quote(ms->ms_is_epid ? 
&ms->ms_spid : NULL, ms->ms_linkable, &ms->ms_report, @@ -785,6 +791,7 @@ sgx_ocall_fn_t ocall_table[OCALL_NR] = { [OCALL_DEBUG_MAP_REMOVE] = sgx_ocall_debug_map_remove, [OCALL_DEBUG_DESCRIBE_LOCATION] = sgx_ocall_debug_describe_location, [OCALL_EVENTFD] = sgx_ocall_eventfd, + [OCALL_IOCTL] = sgx_ocall_ioctl, [OCALL_GET_QUOTE] = sgx_ocall_get_quote, }; diff --git a/pal/src/host/linux-sgx/pal_devices.c b/pal/src/host/linux-sgx/pal_devices.c index e4bc70a3ed..3cb745e903 100644 --- a/pal/src/host/linux-sgx/pal_devices.c +++ b/pal/src/host/linux-sgx/pal_devices.c @@ -18,6 +18,7 @@ #include "pal_linux.h" #include "pal_linux_error.h" #include "perm.h" +#include "toml.h" static int dev_open(PAL_HANDLE* handle, const char* type, const char* uri, enum pal_access access, pal_share_flags_t share, enum pal_create_mode create, @@ -218,3 +219,730 @@ struct handle_ops g_dev_ops = { .attrquery = &dev_attrquery, .attrquerybyhdl = &dev_attrquerybyhdl, }; + +/* + * Code below describes the deep-copy syntax in the TOML manifest used for copying complex nested + * objects out and inside the SGX enclave. This syntax is currently used for IOCTL emulation. This + * syntax is generic enough to describe any memory layout for deep copy of IOCTL structs. + * + * The following example describes the main implementation details: + * + * struct pascal_str { uint8_t len; char str[]; }; + * struct c_str { char str[]; }; + * struct root { struct pascal_str* s1; struct c_str* s2; uint64_t s2_len; int8_t x; int8_t y; }; + * + * alignas(128) struct root obj; + * ioctl(devfd, _IOWR(DEVICE_MAGIC, DEVICE_FUNC, struct root), &obj); + * + * The example IOCTL takes as a third argument a pointer to an object of type `struct root` that + * contains two pointers to other objects (pascal-style string and a C-style string) and embeds two + * integers `x` and `y`. The two strings reside in separate memory regions in enclave memory. 
Note + * that the max possible length of the C-style string is stored in the `s2_len` field of the root + * object. The `pascal_str` string is an input to the IOCTL, the `c_str` string and its length + * `s2_len` are the outputs of the IOCTL, and the integers `x` and `y` are also outputs of the + * IOCTL. Also note that the root object is 128B-aligned (for illustration purposes). This IOCTL + * could for example be used to convert a Pascal string into a C string (C string will be truncated + * to user-specified `s2_len` if greater than this limit), and find the indices of the first + * occurrences of chars "x" and "y" in the Pascal string. + * + * The corresponding deep-copy syntax in TOML looks like this: + * + * sgx.ioctl_structs.ROOT_FOR_DEVICE_FUNC = [ + * { align = 128, ptr = [ {name="pascal-str-len", size=1, type="out"}, + * {name="pascal-str", size="pascal-str-len", type="out"} ] }, + * { ptr = [ {name="c-str", size="c-str-len", type="in"} ], size = 1 }, + * { name = "c-str-len", size = 8, unit = 1, adjust = 0, type = "inout" }, + * { size = 2, type = "in" } + * { size = 2, type = "in" } + * ] + * + * sgx.allowed_ioctls.DEVICE_FUNC.request = + * sgx.allowed_ioctls.DEVICE_FUNC.struct = "ROOT_FOR_DEVICE_FUNC" + * + * One can observe the following rules in this TOML syntax: + * + * 1. Each separate memory region is represented as a TOML array (`[]`). + * 2. Each sub-region of one memory region is represented as a TOML table (`{}`). + * 3. Each sub-region may be a pointer (`ptr`) to another memory region. In this case, the value of + * `ptr` is a TOML-array representation of that other memory region. The `ptr` sub-region always + * has size of 8B (assuming x86-64) and doesn't have an in/out type. The `size` field of the + * `ptr` sub-region has a different meaning than for non-pointer sub-regions: it is the number + * of adjacent memory regions that this pointer points to (i.e. it describes an array). + * 4. 
Sub-regions can be fixed-size (like the last sub-region containing two bytes `x` and `y`) or + * can be flexible-size (like the two strings). In the latter case, the `size` field contains a + * name of a sub-region where the actual size is stored. + * 5. Sub-regions that store the size of another sub-region must be 1, 2, 4, or 8 bytes in size. + * 6. Sub-regions may have a name for ease of identification; this is required for "size" + * sub-regions but may be omitted for all other kinds of sub-regions. + * 7. Sub-regions may have one of the four types: "out" to copy contents of the sub-region outside + * the enclave to untrusted memory, "in" to copy from untrusted memory to inside the enclave, + * "inout" to copy in both directions, "none" to not copy at all (useful for e.g. padding). + * Note that pointer sub-regions do not have a type (their values are unconditionally rewired so + * as to point to the copied-out region in untrusted memory). + * 8. The first sub-region (and only the first!) may specify the alignment of the memory region. + * 9. The total size of a sub-region is calculated as `size * unit + adjust`. By default `unit` is + * 1 byte and `adjust` is 0. Note that `adjust` may be a negative number. + * + * The diagram below shows how this complex object is copied from enclave memory (left side) to + * untrusted memory (right side). MR stands for "memory region", SR stands for "sub-region". Note + * how enclave pointers are copied and rewired to point to untrusted memory regions. 
+ * + * struct root (MR1) | deep-copied struct (aligned at 128B) + * +------------------+ | +------------------------+ + * +----+ pascal_str* s1 | SR1 | +----+ pascal_str* s1 (MR1)| + * | | | | | | | + * | | c_str* s2 +-------+ SR2 | | | c_str* s2 +-------------+ + * | | | | | | | | | + * | | uint64_t s2_len | | SR3 | | | uint64_t s2_len | | + * | | | | | | | | | + * | | int8_t x, y | | SR4 | | | int8_t x=0, y=0 | | + * | +------------------+ | | | +------------------------+ | + * | | | +->| uint8_t len (MR2)| | + * v (MR2) | | | | | + * +-------------+ | | | char str[len] | | + * | uint8_t len | | SR5 | +------------------------+ | + * | | | | | char str[s2_len] (MR3)|<-+ + * | char str[] | | SR6 | +------------------------+ + * +-------------+ | | + * (MR3) v | + * +----------+-+ | + * | char str[] | SR7 | + * +------------+ | + * + */ + +/* for simplicity, we limit the number of memory and sub-regions; these limits should be enough for + * any reasonable IOCTL struct object */ +#define MAX_MEM_REGIONS 1024 +#define MAX_SUB_REGIONS (10 * 1024) + +/* direction of copy: none (used for padding), out of enclave, inside enclave, both or a special + * "pointer" sub-region; default is COPY_NONE_ENCLAVE */ +enum mem_copy_type {COPY_NONE_ENCLAVE, COPY_OUT_ENCLAVE, COPY_IN_ENCLAVE, COPY_INOUT_ENCLAVE, + COPY_PTR_ENCLAVE}; + +struct mem_region { + toml_array_t* toml_array; /* describes contiguous sub_regions in this mem_region */ + void* encl_addr; /* base address of this memory region in enclave memory */ + bool adjacent; /* memory region adjacent to previous one? 
(used for arrays) */ +}; + +struct sub_region { + enum mem_copy_type type; /* direction of copy during OCALL (or pointer to another region) */ + char* name; /* may be NULL for unnamed regions */ + uint64_t name_hash; /* hash of "name" for fast string comparison */ + ssize_t align; /* alignment of this sub-region */ + ssize_t size; /* may be dynamically determined from another sub-region */ + char* size_name; /* needed if "size" sub region is defined after this sub region */ + uint64_t size_name_hash; /* needed if "size" sub region is defined after this sub region */ + ssize_t unit; /* total size in bytes is calculated as `size * unit + adjust` */ + ssize_t adjust; /* may be negative; total size in bytes is `size * unit + adjust` */ + void* encl_addr; /* base address of this sub region in enclave memory */ + void* untrusted_addr; /* base address of the corresponding sub region in untrusted memory */ + toml_array_t* mem_ptr; /* for pointers/arrays, specifies pointed-to mem region */ +}; + +static inline uint64_t hash(char* str) { + /* simple hash function djb2 by Dan Bernstein, used for quick comparison of strings */ + uint64_t hash = 5381; + char c; + while ((c = *str++)) + hash = ((hash << 5) + hash) + c; + return hash; +} + +static bool strings_equal(const char* s1, const char* s2, uint64_t s1_hash, uint64_t s2_hash) { + if (!s1 || !s2 || s1_hash != s2_hash) + return false; + assert(s1_hash == s2_hash); + return !strcmp(s1, s2); +} + +/* finds a sub region with name `sub_region_name` among `sub_regions` and returns its index */ +static int get_sub_region_idx(struct sub_region* sub_regions, size_t sub_regions_cnt, + const char* sub_region_name, uint64_t sub_region_name_hash, + size_t* out_idx) { + /* it is important to iterate in reverse order because there may be an array of mem regions + * with same-named sub regions, and we want to find the latest sub region */ + for (size_t i = sub_regions_cnt; i > 0; i--) { + size_t idx = i - 1; + if 
(strings_equal(sub_regions[idx].name, sub_region_name, + sub_regions[idx].name_hash, sub_region_name_hash)) { + /* found corresponding sub region */ + if (sub_regions[idx].type != COPY_PTR_ENCLAVE || !sub_regions[idx].mem_ptr) { + /* sub region is not a valid pointer to a memory region */ + return -PAL_ERROR_DENIED; + } + *out_idx = idx; + return 0; + } + } + return -PAL_ERROR_NOTDEFINED; +} + +/* finds a sub region with name `sub_region_name` among `sub_regions` and reads the value in it */ +static int get_sub_region_value(struct sub_region* sub_regions, size_t sub_regions_cnt, + const char* sub_region_name, uint64_t sub_region_name_hash, + ssize_t* out_value) { + /* it is important to iterate in reverse order because there may be an array of memory regions + * with same-named sub regions, and we want to find the "latest value" sub region, i.e. the one + * belonging to the same memory region */ + for (size_t i = sub_regions_cnt; i > 0; i--) { + size_t idx = i - 1; + if (strings_equal(sub_regions[idx].name, sub_region_name, + sub_regions[idx].name_hash, sub_region_name_hash)) { + /* found corresponding sub region, read its value */ + if (!sub_regions[idx].encl_addr || sub_regions[idx].encl_addr == (void*)-1) { + /* enclave address is invalid, user provided bad struct */ + return -PAL_ERROR_DENIED; + } + + if (sub_regions[idx].size == sizeof(uint8_t)) { + *out_value = (ssize_t)(*((uint8_t*)sub_regions[idx].encl_addr)); + } else if (sub_regions[idx].size == sizeof(uint16_t)) { + *out_value = (ssize_t)(*((uint16_t*)sub_regions[idx].encl_addr)); + } else if (sub_regions[idx].size == sizeof(uint32_t)) { + *out_value = (ssize_t)(*((uint32_t*)sub_regions[idx].encl_addr)); + } else if (sub_regions[idx].size == sizeof(uint64_t)) { + *out_value = (ssize_t)(*((uint64_t*)sub_regions[idx].encl_addr)); + } else { + log_error("Invalid deep-copy syntax (deep-copy sub-entry '%s' must be of " + "legitimate size: 1, 2, 4 or 8 bytes)", sub_regions[idx].name); + return 
-PAL_ERROR_INVAL; + } + + return 0; + } + } + + return -PAL_ERROR_NOTDEFINED; +} + +/* caller sets `sub_regions_cnt_ptr` to maximum number of sub_regions; this variable is updated to + * return the number of actually used sub_regions */ +static int collect_sub_regions(toml_array_t* root_toml_array, void* root_encl_addr, + struct sub_region* sub_regions, size_t* sub_regions_cnt_ptr) { + int ret; + + assert(root_toml_array && toml_array_nelem(root_toml_array) > 0); + assert(sub_regions && sub_regions_cnt_ptr); + + size_t max_sub_regions = *sub_regions_cnt_ptr; + size_t sub_regions_cnt = 0; + + assert(get_tcb_trts()->ioctl_scratch_space); + struct mem_region* mem_regions = (struct mem_region*)get_tcb_trts()->ioctl_scratch_space; + mem_regions[0].toml_array = root_toml_array; + mem_regions[0].encl_addr = root_encl_addr; + mem_regions[0].adjacent = false; + size_t mem_regions_cnt = 1; + + /* collecting memory regions and their sub-regions must use breadth-first search to dynamically + * calculate sizes of sub-regions even if they are specified via another sub-region's "name" */ + char* cur_encl_addr = NULL; + size_t mem_region_idx = 0; + while (mem_region_idx < mem_regions_cnt) { + struct mem_region* cur_mem_region = &mem_regions[mem_region_idx]; + mem_region_idx++; + + if (!cur_mem_region->adjacent) + cur_encl_addr = cur_mem_region->encl_addr; + + size_t cur_mem_region_first_sub_region = sub_regions_cnt; + + for (size_t i = 0; i < (size_t)toml_array_nelem(cur_mem_region->toml_array); i++) { + toml_table_t* sub_region_info = toml_table_at(cur_mem_region->toml_array, i); + if (!sub_region_info) { + log_error("Invalid deep-copy syntax (each memory subregion must be a TOML " + "table)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + + if (sub_regions_cnt == max_sub_regions) { + log_error("Too many memory sub-regions in a deep-copy syntax (maximum " + "possible is %lu)", max_sub_regions); + ret = -PAL_ERROR_NOMEM; + goto out; + } + + struct sub_region* cur_sub_region = 
&sub_regions[sub_regions_cnt]; + sub_regions_cnt++; + + cur_sub_region->untrusted_addr = NULL; + cur_sub_region->mem_ptr = NULL; + + cur_sub_region->encl_addr = cur_encl_addr; + if (!cur_encl_addr || cur_encl_addr == (void*)-1) { + /* enclave address is invalid, user provided bad struct */ + ret = -PAL_ERROR_DENIED; + goto out; + } + + toml_raw_t sub_region_name_raw = toml_raw_in(sub_region_info, "name"); + toml_raw_t sub_region_type_raw = toml_raw_in(sub_region_info, "type"); + toml_raw_t sub_region_align_raw = toml_raw_in(sub_region_info, "align"); + toml_raw_t sub_region_size_raw = toml_raw_in(sub_region_info, "size"); + toml_raw_t sub_region_unit_raw = toml_raw_in(sub_region_info, "unit"); + toml_raw_t sub_region_adjust_raw = toml_raw_in(sub_region_info, "adjust"); + + toml_array_t* sub_region_ptr_arr = toml_array_in(sub_region_info, "ptr"); + if (!sub_region_ptr_arr) { + /* "ptr" to another sub-region doesn't use TOML's inline array syntax, maybe it is a + * reference to already-defined sub-region (e.g., `ptr = "my-struct"`) */ + toml_raw_t sub_region_ptr_raw = toml_raw_in(sub_region_info, "ptr"); + if (sub_region_ptr_raw) { + char* sub_region_name = NULL; + ret = toml_rtos(sub_region_ptr_raw, &sub_region_name); + if (ret < 0) { + log_error("Invalid deep-copy syntax ('ptr' of a deep-copy sub-entry " + "must be a TOML array or a string surrounded by double quotes)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + + if (!strcmp(sub_region_name, "this")) { + /* special case of `{ptr="this"}` -- ptr to object of the IOCTL root type */ + sub_region_ptr_arr = root_toml_array; + } else { + size_t idx; + ret = get_sub_region_idx(sub_regions, sub_regions_cnt, sub_region_name, + hash(sub_region_name), &idx); + if (ret < 0) { + log_error("Invalid deep-copy syntax (cannot find sub region '%s')", + sub_region_name); + free(sub_region_name); + goto out; + } + + assert(idx < sub_regions_cnt); + assert(sub_regions[idx].type == COPY_PTR_ENCLAVE); + 
assert(sub_regions[idx].mem_ptr); + sub_region_ptr_arr = sub_regions[idx].mem_ptr; + } + + free(sub_region_name); + } + } + + if (sub_region_align_raw && i != 0) { + log_error("Invalid deep-copy syntax ('align' may be specified only for the " + "first sub-region of the memory region)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + + if (sub_region_type_raw && sub_region_ptr_arr) { + log_error("Invalid deep-copy syntax ('ptr' sub-entries cannot specify " + "a 'type'; pointers are never copied directly but rewired)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + + cur_sub_region->name = NULL; + cur_sub_region->name_hash = 0; + if (sub_region_name_raw) { + ret = toml_rtos(sub_region_name_raw, &cur_sub_region->name); + if (ret < 0) { + log_error("Invalid deep-copy syntax ('name' of a deep-copy sub-entry " + "must be a TOML string surrounded by double quotes)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + cur_sub_region->name_hash = hash(cur_sub_region->name); + } + + cur_sub_region->type = COPY_NONE_ENCLAVE; + if (sub_region_type_raw) { + char* type_str = NULL; + ret = toml_rtos(sub_region_type_raw, &type_str); + if (ret < 0) { + log_error("Invalid deep-copy syntax ('type' of a deep-copy sub-entry " + "must be a TOML string surrounded by double quotes)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + + if (!strcmp(type_str, "out")) { + cur_sub_region->type = COPY_OUT_ENCLAVE; + } else if (!strcmp(type_str, "in")) { + cur_sub_region->type = COPY_IN_ENCLAVE; + } else if (!strcmp(type_str, "inout")) { + cur_sub_region->type = COPY_INOUT_ENCLAVE; + } else if (!strcmp(type_str, "none")) { + cur_sub_region->type = COPY_NONE_ENCLAVE; + } else { + log_error("Invalid deep-copy syntax ('type' of a deep-copy sub-entry " + "must be one of \"out\", \"in\", \"inout\" or \"none\")"); + free(type_str); + ret = -PAL_ERROR_INVAL; + goto out; + } + + free(type_str); + } + + cur_sub_region->align = 0; + if (sub_region_align_raw) { + ret = toml_rtoi(sub_region_align_raw, 
&cur_sub_region->align); + if (ret < 0 || cur_sub_region->align <= 0) { + log_error("Invalid deep-copy syntax ('align' of a deep-copy sub-entry " + "must be a positive number)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + } + + if (sub_region_ptr_arr) { + /* only set type for now, we postpone pointer/array handling for later */ + cur_sub_region->type = COPY_PTR_ENCLAVE; + cur_sub_region->mem_ptr = sub_region_ptr_arr; + } + + cur_sub_region->size = -1; + cur_sub_region->size_name = NULL; + cur_sub_region->size_name_hash = 0; + if (sub_region_size_raw) { + ret = toml_rtos(sub_region_size_raw, &cur_sub_region->size_name); + if (ret == 0) { + cur_sub_region->size_name_hash = hash(cur_sub_region->size_name); + + ssize_t val = -1; + /* "sub_regions_cnt - 1" is to exclude myself; do not fail if couldn't find + * (we will try later one more time) */ + ret = get_sub_region_value(sub_regions, sub_regions_cnt - 1, + cur_sub_region->size_name, + cur_sub_region->size_name_hash, &val); + if (ret < 0 && ret != -PAL_ERROR_NOTDEFINED) { + goto out; + } + cur_sub_region->size = val; + } else { + /* size is specified not as string (another sub-region's name), then must be + * specified explicitly as number of bytes */ + ret = toml_rtoi(sub_region_size_raw, &cur_sub_region->size); + if (ret < 0 || cur_sub_region->size <= 0) { + log_error("Invalid deep-copy syntax ('size' of a deep-copy " + "sub-entry must be a TOML string or a positive number)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + } + } + + cur_sub_region->unit = 1; /* 1 byte by default */ + if (sub_region_unit_raw) { + ret = toml_rtoi(sub_region_unit_raw, &cur_sub_region->unit); + if (ret < 0 || cur_sub_region->unit <= 0) { + log_error("Invalid deep-copy syntax ('unit' of a deep-copy sub-entry " + "must be a positive number)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + } + + cur_sub_region->adjust = 0; + if (sub_region_adjust_raw) { + ret = toml_rtoi(sub_region_adjust_raw, &cur_sub_region->adjust); + if (ret < 0) { + 
log_error("Invalid deep-copy syntax ('adjust' of a deep-copy sub-entry " + "is not a valid number)"); + ret = -PAL_ERROR_INVAL; + goto out; + } + } + + if (cur_sub_region->size >= 0) { + cur_sub_region->size *= cur_sub_region->unit; + cur_sub_region->size += cur_sub_region->adjust; + } + + if (cur_sub_region->type == COPY_PTR_ENCLAVE) { + cur_encl_addr += sizeof(void*); + } else { + assert(cur_sub_region->size >= 0); + cur_encl_addr += (uintptr_t)cur_sub_region->size; + } + } + + /* iterate through collected pointer/array sub regions and add corresponding mem regions */ + for (size_t i = cur_mem_region_first_sub_region; i < sub_regions_cnt; i++) { + if (sub_regions[i].type != COPY_PTR_ENCLAVE) + continue; + + if (sub_regions[i].size >= 0) { + /* size was found in the first swoop, nothing to do here */ + } else if (sub_regions[i].size < 0 && sub_regions[i].size_name) { + /* pointer/array size was not found in the first swoop, try again */ + ssize_t val = -1; + ret = get_sub_region_value(sub_regions, sub_regions_cnt, sub_regions[i].size_name, + sub_regions[i].size_name_hash, &val); + if (ret < 0) { + log_error("Invalid deep-copy syntax (cannot find sub region '%s')", + sub_regions[i].size_name); + goto out; + } + if (val < 0) { + log_error("Invalid deep-copy syntax (sub region '%s' has negative size)", + sub_regions[i].size_name); + ret = -PAL_ERROR_INVAL; + goto out; + } + sub_regions[i].size = val; + } else { + /* size is not specified at all for this sub region, assume it is 1 item */ + sub_regions[i].size = 1; + } + + for (size_t k = 0; k < (size_t)sub_regions[i].size; k++) { + if (mem_regions_cnt == MAX_MEM_REGIONS) { + log_error("Too many memory regions in a deep-copy syntax (maximum " + "possible is %d)", MAX_MEM_REGIONS); + ret = -PAL_ERROR_NOMEM; + goto out; + } + + void* mem_region_addr = *((void**)sub_regions[i].encl_addr); + if (!mem_region_addr) + continue; + + mem_regions[mem_regions_cnt].toml_array = sub_regions[i].mem_ptr; +
mem_regions[mem_regions_cnt].encl_addr = mem_region_addr; + mem_regions[mem_regions_cnt].adjacent = k > 0; + mem_regions_cnt++; + } + + sub_regions[i].size = sizeof(void*); /* rewire to actual size of "ptr" sub-region */ + } + } + + *sub_regions_cnt_ptr = sub_regions_cnt; + ret = 0; +out: + for (size_t i = 0; i < sub_regions_cnt; i++) { + /* "name" fields are not needed after we collected all sub_regions */ + free(sub_regions[i].name); + free(sub_regions[i].size_name); + sub_regions[i].name = NULL; + sub_regions[i].size_name = NULL; + } + return ret; +} + +static void copy_sub_regions_to_untrusted(struct sub_region* sub_regions, size_t sub_regions_cnt, + void* untrusted_addr) { + char* cur_untrusted_addr = untrusted_addr; + for (size_t i = 0; i < sub_regions_cnt; i++) { + if (sub_regions[i].size <= 0 || !sub_regions[i].encl_addr) + continue; + + if (sub_regions[i].align > 0) { + char* aligned_untrusted_addr = ALIGN_UP_PTR(cur_untrusted_addr, sub_regions[i].align); + memset(cur_untrusted_addr, 0, aligned_untrusted_addr - cur_untrusted_addr); + cur_untrusted_addr = aligned_untrusted_addr; + } + + if (sub_regions[i].type == COPY_OUT_ENCLAVE || sub_regions[i].type == COPY_INOUT_ENCLAVE) { + memcpy(cur_untrusted_addr, sub_regions[i].encl_addr, sub_regions[i].size); + } else { + memset(cur_untrusted_addr, 0, sub_regions[i].size); + } + + sub_regions[i].untrusted_addr = cur_untrusted_addr; + cur_untrusted_addr += sub_regions[i].size; + } + + for (size_t i = 0; i < sub_regions_cnt; i++) { + if (sub_regions[i].size <= 0 || !sub_regions[i].encl_addr) + continue; + + if (sub_regions[i].type == COPY_PTR_ENCLAVE) { + void* encl_ptr_value = *((void**)sub_regions[i].encl_addr); + /* rewire pointer value in untrusted memory to a corresponding untrusted sub-region */ + for (size_t j = 0; j < sub_regions_cnt; j++) { + if (sub_regions[j].encl_addr == encl_ptr_value) { + *((void**)sub_regions[i].untrusted_addr) = sub_regions[j].untrusted_addr; + break; + } + } + } + } +} + +static 
void copy_sub_regions_to_enclave(struct sub_region* sub_regions, size_t sub_regions_cnt) { + for (size_t i = 0; i < sub_regions_cnt; i++) { + if (sub_regions[i].size <= 0 || !sub_regions[i].encl_addr) + continue; + + if (sub_regions[i].type == COPY_IN_ENCLAVE || sub_regions[i].type == COPY_INOUT_ENCLAVE) + memcpy(sub_regions[i].encl_addr, sub_regions[i].untrusted_addr, sub_regions[i].size); + } +} + +static int get_ioctl_struct(uint32_t cmd, toml_array_t** out_toml_ioctl_struct) { + int ret; + + /* find this IOCTL request in the manifest */ + toml_table_t* manifest_sgx = toml_table_in(g_pal_public_state.manifest_root, "sgx"); + if (!manifest_sgx) + return -PAL_ERROR_NOTIMPLEMENTED; + + toml_table_t* toml_allowed_ioctls = toml_table_in(manifest_sgx, "allowed_ioctls"); + if (!toml_allowed_ioctls) + return -PAL_ERROR_NOTIMPLEMENTED; + + ssize_t toml_allowed_ioctls_cnt = toml_table_ntab(toml_allowed_ioctls); + if (toml_allowed_ioctls_cnt <= 0) + return -PAL_ERROR_NOTIMPLEMENTED; + + for (ssize_t idx = 0; idx < toml_allowed_ioctls_cnt; idx++) { + const char* toml_allowed_ioctl_key = toml_key_in(toml_allowed_ioctls, idx); + assert(toml_allowed_ioctl_key); + + toml_table_t* toml_ioctl_table = toml_table_in(toml_allowed_ioctls, toml_allowed_ioctl_key); + if (!toml_ioctl_table) + continue; + + toml_raw_t toml_ioctl_request_raw = toml_raw_in(toml_ioctl_table, "request"); + if (!toml_ioctl_request_raw) + continue; + + int64_t ioctl_request = 0x0; + ret = toml_rtoi(toml_ioctl_request_raw, &ioctl_request); + if (ret < 0 || ioctl_request == 0x0) { + log_error("Invalid request value of allowed ioctl '%s' in manifest", + toml_allowed_ioctl_key); + continue; + } + + if (ioctl_request == (int64_t)cmd) { + /* found this IOCTL request in the manifest, now must find the corresponding struct */ + toml_raw_t toml_ioctl_struct_raw = toml_raw_in(toml_ioctl_table, "struct"); + if (!toml_ioctl_struct_raw) { + log_error("Cannot find struct value of allowed ioctl '%s' in manifest", + 
toml_allowed_ioctl_key); + return -PAL_ERROR_NOTIMPLEMENTED; + } + + char* ioctl_struct_str = NULL; + ret = toml_rtos(toml_ioctl_struct_raw, &ioctl_struct_str); + if (ret < 0) { + log_error("Invalid struct value of allowed ioctl '%s' in manifest " + "(sgx.allowed_ioctls.[identifier].struct must be a TOML string)", + toml_allowed_ioctl_key); + return -PAL_ERROR_INVAL; + } + + toml_table_t* toml_ioctl_structs = toml_table_in(manifest_sgx, "ioctl_structs"); + if (!toml_ioctl_structs) { + log_error("There are no ioctl structs found in manifest"); + free(ioctl_struct_str); + return -PAL_ERROR_INVAL; + } + + toml_array_t* toml_ioctl_struct = toml_array_in(toml_ioctl_structs, ioctl_struct_str); + if (!toml_ioctl_struct) { + log_error("Cannot find struct value '%s' of allowed ioctl '%s' in " + "manifest (or it is not a correctly formatted TOML array)", + ioctl_struct_str, toml_allowed_ioctl_key); + free(ioctl_struct_str); + return -PAL_ERROR_INVAL; + } + free(ioctl_struct_str); + + *out_toml_ioctl_struct = toml_ioctl_struct; + return 0; + } + } + + return -PAL_ERROR_NOTIMPLEMENTED; +} + +/* + * Thread-local scratch space for IOCTL internal data: + * 1. Memregions array of size MAX_MEM_REGIONS + + * 2. Subregions array of size MAX_SUB_REGIONS + * + * Note that this scratch space is allocated once per thread and never freed. Also, we assume that + * IOCTLs during signal handling are impossible, so there is no need to protect via atomic variable + * like `ocall_mmap_untrusted_cache: in_use`. 
+ */ +static int init_ioctl_scratch_space(void) { + if (get_tcb_trts()->ioctl_scratch_space) + return 0; + + size_t total_size = MAX_MEM_REGIONS * sizeof(struct mem_region) + + MAX_SUB_REGIONS * sizeof(struct sub_region); + void* scratch_space = calloc(1, total_size); + if (!scratch_space) + return -PAL_ERROR_NOMEM; + + get_tcb_trts()->ioctl_scratch_space = scratch_space; + return 0; +} + +int _PalDeviceIoControl(PAL_HANDLE handle, uint32_t cmd, uint64_t arg, int* out_ret) { + int ret; + + if (handle->hdr.type != PAL_TYPE_DEV) + return -PAL_ERROR_INVAL; + + if (handle->dev.fd == PAL_IDX_POISON) + return -PAL_ERROR_DENIED; + + ret = init_ioctl_scratch_space(); + if (ret < 0) + return ret; + + toml_array_t* toml_ioctl_struct = NULL; + ret = get_ioctl_struct(cmd, &toml_ioctl_struct); + if (ret < 0) + return ret; + + if (toml_array_nelem(toml_ioctl_struct) == 0) { + /* special case of an empty TOML array -> base-type or ignored IOCTL argument */ + ret = ocall_ioctl(handle->dev.fd, cmd, arg); + if (ret < 0) + return unix_to_pal_error(ret); + + *out_ret = ret; + return 0; + } + + size_t sub_regions_cnt = MAX_SUB_REGIONS; + struct sub_region* sub_regions = (struct sub_region*)((char*)get_tcb_trts()->ioctl_scratch_space + + MAX_MEM_REGIONS * sizeof(struct mem_region)); + + /* typical IOCTL case: deep-copy the IOCTL argument's input data outside of enclave, execute the + * IOCTL OCALL, and deep-copy the IOCTL argument's output data back into enclave */ + ret = collect_sub_regions(toml_ioctl_struct, (void*)arg, sub_regions, &sub_regions_cnt); + if (ret < 0) + return ret; + + void* untrusted_addr = NULL; + size_t untrusted_size = 0; + for (size_t i = 0; i < sub_regions_cnt; i++) + untrusted_size += sub_regions[i].size + sub_regions[i].align; + + ret = ocall_mmap_untrusted(&untrusted_addr, ALLOC_ALIGN_UP(untrusted_size), + PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, /*fd=*/-1, + /*offset=*/0); + if (ret < 0) + return unix_to_pal_error(ret); + + assert(untrusted_addr); +
copy_sub_regions_to_untrusted(sub_regions, sub_regions_cnt, untrusted_addr); + + ret = ocall_ioctl(handle->dev.fd, cmd, (uint64_t)untrusted_addr); + if (ret < 0) { + ocall_munmap_untrusted(untrusted_addr, ALLOC_ALIGN_UP(untrusted_size)); + return unix_to_pal_error(ret); + } + + copy_sub_regions_to_enclave(sub_regions, sub_regions_cnt); + + ocall_munmap_untrusted(untrusted_addr, ALLOC_ALIGN_UP(untrusted_size)); + + *out_ret = ret; + return 0; +} diff --git a/pal/src/host/linux-sgx/pal_ocall_types.h b/pal/src/host/linux-sgx/pal_ocall_types.h index 6acb0cca03..f6fbc17a5d 100644 --- a/pal/src/host/linux-sgx/pal_ocall_types.h +++ b/pal/src/host/linux-sgx/pal_ocall_types.h @@ -68,6 +68,7 @@ enum { OCALL_DEBUG_MAP_REMOVE, OCALL_DEBUG_DESCRIBE_LOCATION, OCALL_EVENTFD, + OCALL_IOCTL, OCALL_GET_QUOTE, OCALL_NR, }; @@ -322,6 +323,12 @@ typedef struct { int ms_flags; } ms_ocall_eventfd_t; +typedef struct { + int ms_fd; + unsigned int ms_cmd; + unsigned long ms_arg; +} ms_ocall_ioctl_t; + typedef struct { bool ms_is_epid; sgx_spid_t ms_spid; diff --git a/pal/src/host/linux-sgx/pal_tls.h b/pal/src/host/linux-sgx/pal_tls.h index a7c341bd72..f0b48ec854 100644 --- a/pal/src/host/linux-sgx/pal_tls.h +++ b/pal/src/host/linux-sgx/pal_tls.h @@ -47,6 +47,7 @@ struct enclave_tls { void* heap_min; void* heap_max; int* clear_child_tid; + void* ioctl_scratch_space; struct untrusted_area untrusted_area_cache; }; diff --git a/pal/src/host/linux/pal_devices.c b/pal/src/host/linux/pal_devices.c index 0a65596781..ab5e4e5007 100644 --- a/pal/src/host/linux/pal_devices.c +++ b/pal/src/host/linux/pal_devices.c @@ -208,3 +208,18 @@ struct handle_ops g_dev_ops = { .attrquery = &dev_attrquery, .attrquerybyhdl = &dev_attrquerybyhdl, }; + +int _PalDeviceIoControl(PAL_HANDLE handle, uint32_t cmd, uint64_t arg, int* out_ret) { + if (handle->hdr.type != PAL_TYPE_DEV) + return -PAL_ERROR_INVAL; + + if (handle->dev.fd == PAL_IDX_POISON) + return -PAL_ERROR_DENIED; + + int ret = DO_SYSCALL(ioctl, 
handle->dev.fd, cmd, arg); + if (ret < 0) + return unix_to_pal_error(ret); + + *out_ret = ret; + return 0; +} diff --git a/pal/src/host/skeleton/pal_devices.c b/pal/src/host/skeleton/pal_devices.c index 418b7c222a..e4e8933085 100644 --- a/pal/src/host/skeleton/pal_devices.c +++ b/pal/src/host/skeleton/pal_devices.c @@ -55,3 +55,7 @@ struct handle_ops g_dev_ops = { .attrquery = &dev_attrquery, .attrquerybyhdl = &dev_attrquerybyhdl, }; + +int _PalDeviceIoControl(PAL_HANDLE handle, uint32_t cmd, uint64_t arg, int* out_ret) { + return -PAL_ERROR_NOTIMPLEMENTED; +} diff --git a/pal/src/pal_misc.c b/pal/src/pal_misc.c index b78e813f45..e25f5766f0 100644 --- a/pal/src/pal_misc.c +++ b/pal/src/pal_misc.c @@ -30,6 +30,10 @@ size_t PalMemoryAvailableQuota(void) { return _PalMemoryAvailableQuota(); } +int PalDeviceIoControl(PAL_HANDLE handle, uint32_t cmd, uint64_t arg, int* out_ret) { + return _PalDeviceIoControl(handle, cmd, arg, out_ret); +} + #if defined(__x86_64__) int PalCpuIdRetrieve(uint32_t leaf, uint32_t subleaf, uint32_t values[4]) { return _PalCpuIdRetrieve(leaf, subleaf, values); diff --git a/pal/src/pal_symbols b/pal/src/pal_symbols index 60faef0ceb..cf93242d10 100644 --- a/pal/src/pal_symbols +++ b/pal/src/pal_symbols @@ -45,6 +45,7 @@ PalSegmentBaseSet PalStreamChangeName PalStreamAttributesSetByHandle PalMemoryAvailableQuota +PalDeviceIoControl PalDebugMapAdd PalDebugMapRemove PalDebugDescribeLocation
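
The sizing rules stated in the deep-copy comment of `pal_devices.c` (a "size" sub-region must be 1, 2, 4, or 8 bytes wide, and a sub-region's total size is `size * unit + adjust`) can be sketched as a standalone C helper. This is only an illustrative sketch, not the code from the patch: `read_size_field` and `total_size` are hypothetical names, the loads go through `memcpy` to avoid alignment assumptions, and a negative total is reported as an error by assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read an unsigned size field of `width` bytes at `addr`.
 * Returns 0 on success, -1 if `width` is not 1, 2, 4 or 8. */
static int read_size_field(const void* addr, size_t width, int64_t* out_value) {
    assert(addr && out_value);
    switch (width) {
        case 1: { uint8_t v;  memcpy(&v, addr, sizeof(v)); *out_value = (int64_t)v; return 0; }
        case 2: { uint16_t v; memcpy(&v, addr, sizeof(v)); *out_value = (int64_t)v; return 0; }
        case 4: { uint32_t v; memcpy(&v, addr, sizeof(v)); *out_value = (int64_t)v; return 0; }
        case 8: { uint64_t v; memcpy(&v, addr, sizeof(v)); *out_value = (int64_t)v; return 0; }
        default: return -1; /* not a legitimate size-field width */
    }
}

/* Hypothetical helper: total size in bytes is `size * unit + adjust`.
 * `adjust` may be negative; a negative total is reported as -1 (assumption). */
static int64_t total_size(int64_t size, int64_t unit, int64_t adjust) {
    int64_t total = size * unit + adjust;
    return total < 0 ? -1 : total;
}
```

For example, a one-byte length field holding 5, combined with `unit = 2` and `adjust = -1`, describes a 9-byte sub-region.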