From e1f2802ea4bfde6fe00d2d815303392b11a262eb Mon Sep 17 00:00:00 2001 From: Gabriella Gonzalez Date: Thu, 16 Feb 2023 14:49:58 -0800 Subject: [PATCH] Fix `ld`'s response file support for special files (part 2) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This is a follow-up to: https://github.com/tpoechtrager/cctools-port/pull/131 … which introduced a bug: the `st_size` field does not necessarily accurately represent the length if the file is a special file. For example, on macOS the `st_size` field appears to empirically be no larger than 64 KB for special files (presumably the size of some buffer), even if the actual input is larger than 64 KB. As a result, `ld` was truncating the input arguments to 64 KB for large response files, which defeats the purpose of response files (since they're typically used to support large command lines). More generally, this issue isn't specific to `fstat`, but rather appears to be an issue with anything that uses a file descriptor returned by `open`. For example, `read` misbehaves in the exact same way and refuses to read more than 64 KB from a special file that was opened by `open` even if you try to repeatedly `read` from the file to completion. For these special files, we don't have a good way to ascertain the length of the input because `fstat` won't work (that only works on file descriptors returned by `open`, which misbehave on special files), nor can we seek to the end to determine the length (because special files might not support rewinding the input) so the first part of this fix is to simply read from the input to the end and see how many bytes receive. This means that we can't allocate all the necessary memory we require up front and instead we need to dynamically resize the argument array as we read. The second part of this solution is to use `fopen` / `fread` / `close` on the unhappy path when `mmap` fails instead of using `open` / `read` / `close` since the latter operations misbehave on special files. --- cctools/ld64/src/ld/ResponseFiles.cpp | 51 ++++++++++++++++++++------- 1 file changed, 39 insertions(+), 12 deletions(-) diff --git a/cctools/ld64/src/ld/ResponseFiles.cpp b/cctools/ld64/src/ld/ResponseFiles.cpp index 2b536b6e..1b321b79 100644 --- a/cctools/ld64/src/ld/ResponseFiles.cpp +++ b/cctools/ld64/src/ld/ResponseFiles.cpp @@ -209,21 +209,11 @@ struct string_list* at_paths, int *hint_p) } char* addr = NULL; - size_t size = sb.st_size; - bool used_mmap = true; + bool used_mmap = false; if (sb.st_size) { addr = (char*)mmap(0, (size_t)sb.st_size, PROT_READ | PROT_WRITE, MAP_FILE | MAP_PRIVATE, fd, 0); - if (MAP_FAILED == addr) { - used_mmap = false; - addr = (char *)malloc(size + 1); - if (-1 == read(fd, addr, size)) { - close(fd); - throwf("can't read %s: %s\n", at_path, strerror(errno)); - return EXPAND_ERROR; - } - addr[size] = '\0'; - } + used_mmap = MAP_FAILED != addr; } if (close(fd)) { @@ -233,6 +223,43 @@ struct string_list* at_paths, int *hint_p) return EXPAND_ERROR; } + // if mmap fails then the input file is likely a special file (e.g. a + // pipe) that doesn't necessarily support st_size or some other method to + // ascertain the input size, so we fall back to allocating as we go + if (!used_mmap) { + size_t size = 4096; + size_t total_read = 0; + size_t num_read; + size_t request;; + + FILE *fp = fopen(at_path, "rb"); + + addr = (char *)malloc(size + 1); + + while ( + request = size - total_read, + num_read = fread(addr + total_read, 1, request, fp), + total_read += num_read, + request == num_read + ) { + size *= 2; + addr = (char*)realloc(addr, size + 1); + } + + if (-1 == num_read) { + close(fd); + throwf("can't read %s: %s\n", at_path, strerror(errno)); + return EXPAND_ERROR; + } + + addr[total_read] = '\0'; + + if (fclose(fp)) { + throwf("can't close %s: %s\n", at_path, strerror(errno)); + return EXPAND_ERROR; + } + } + // build a new argument list now if (0 == newargs.nstr) { string_list_add_argv(&newargs, i, args->strs);