-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mmap fails for empty files on Windows #72
Comments
Can you expand on why you would prefer that? The behavior of returning an error is consistent on Windows and Linux, and is even checked in the test suite. |
While the current behavior is consistent, I think almost all callers are going to need special handling for the zero-length case to avoid this error, and as a result there are probably a lot of latent bugs out there waiting for someone to trigger them with an empty file. I just realized I have one in my own code, where I check for a number of conditions but forgot to check for this one. Maybe relevant, there are cases today where Mmap can accept a zero length and successfully deref to an empty slice. It can happen if you have an offset that isn't a multiple of the page size, so that the real mapping under the covers isn't empty. So in that particular case, the caller might end up with an even trickier bug, where their empty maps fail only for certain offsets. (This is especially tough because an "is the file empty" check isn't good enough here. You have to know how your request offset compares to the file length.) |
That's a very convincing argument, based on that line of reasoning I'd accept a PR to make the behavior consistent with explicit checks. Should this be considered a backwards incompatible change (and require a 0.8 version bump?) |
@danburkert I've generally considered the transformation of an error into a non-error to not be a breaking change---and have even designed APIs around that assumption to allow backwards compatible expansion in the future. I'm trying to think of any special considerations that might be in play here. The only thing I can think of is virtual files on Linux. My goto example is
So in that sense, if memory mapping virtual files now "works" by claiming it is an empty file, then you might consider that a breaking change. Tangentially, it does also seem less than ideal for opening such files to appear as if they succeed. If they do, then callers can't actually remove the zero length check anyway. @oconnor663 I've generally found it more robust to just call |
Wow this is dirtier than I imagined :| And trying to distinguish virtual files based on the device type sounds like it would invite a lot of portability issues and platform-specific code that Would anyone consider this an acceptable strategy (very low confidence suggestion here): "If I can already think of one case that would break this logic, which would be to open a |
@oconnor663 Yeah, that does not sound pleasant. The extra syscall there does not seem good. AIUI, zero length memory maps are an error reported by the syscall itself, right? If so, then I kind of feel like we should just surface that as is done today I think. |
Why so complicated? Just handle the case where the len parameter to mmap is 0, i.e. just change these lines: Lines 40 to 46 in 3b047cc
In this case you could safely do something like: return Ok(MmapInner {
ptr: std::ptr::null_mut(),
len: 0,
}); Handling of special virtual files that are of length zero but still give some bytes during |
The implementation is not what's complicated. Deciding the behavior in the first place is what's complicated.
This is kind of just dismissing the point. The idea, as far as I can tell, was to reduce the case analysis the caller is required to perform in order to use mmap correctly. But this just winds up shifting the case analysis rather than removing it; and moreover, is arguably a breaking change. |
I see the difficulty here but I think that proper detection and handling of unmappable files should be left to the user. |
That's the case today. From
|
According to the man pages the standard way to check for filesystems without mmap support seems to be to call mmap on these and see if it fails with
I would prefer option 2 even though it is a breaking change. References:
From the POSIX Programmer's Manual:
|
This doesn't work well for programs that traverse the entire file hierarchy, like It seems like the underlying APIs are pretty tightly coupled to the notion that memmapping empty files is an error, and that callers will fall back to normal reading in the face of errors. (Or perhaps that callers won't try to mmap below a certain size.) If we want to make empty files not an error, it's suddenly our responsibility to detect when the filesystem is lying to use about a file's size. And as the old saying goes: If the file system wants to lie to us, then let the file system lie to us. |
Mapping empty files isn't an error, you can safely map an empty file as long as the mapping is larger than the file. You can not write behind the files actual end, but mapping is still possible and once you increase the file size writing becomes safe.
And the file system is also not really lying to us (this may look like a detail but is important): For example the procfs reports size 0 for /proc/cpuinfo. This is because the "file" does not actually take up any size anywhere - it is dynamically generated while read() ing it. Therefore mmap is not possible on this file. There are other files in /proc, for example you might find A file in Linux is simply just a handle to a certain resource, in the case of /proc/cpuinfo a handle towards cpu information. The actual resource defines how you may access it - in this case only via the read() syscall. So: mapping a 0-sized file is fine as long as the mapping is larger than 0. Mapping unmappable files (/proc/cpuinfo is likely such a file, but I haven't tried yet) will yield an error. A file is unmappable if the filesystem decides not to implement mmap at all or if the filesystem decides that this specific file is not mappable. |
Well, while thinking about it, this offers another possible solution: Always internally map at least one page of memory, even when the file is actually smaller. If the mmap returns an error here, we know that a file is not mappable, if it succeeds, we are fine and can still give an empty slice to the user. |
Oh I like that idea. Are there any cases where |
Yes, for example many of the device files in |
What's the Right Way to mmap a device like that? How do you know how many bytes to map? |
AFAIK you get the size of the underlying storage via ioctl calls. But to make those calls you need the apropriate access rights. I recommend that you don't think of device files like regular files. Only because some of them behave a lot like regular files, most of them won't and will still be mappable. For example imagine a CD changer. It may have multiple CDs inside, and you can set the active one through ioctl calls. The size of the device is unclear or can change at any time. So the device file really does not have a size but instead the underlying hardware may have one or more sizes. The file is just an abstraction over it. |
Fixes danburkert#72 Previously, mapping an empty file or requesting a zero-size mapping would return an error because underlying OS APIs do not support zero-size mappings. Now, this crate instead requests a one-byte mapping (which most lilkely ends up rounded up to one memory page) but still has `Deref` return an empty slice in those cases. This is the behavior suggested for Unix in danburkert#72 (comment) https://www.man7.org/linux/man-pages/man2/mmap.2.html On Windows, docs for `CreateFileMappingW` say: https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-createfilemappingw > `dwMaximumSizeHigh` > > The high-order DWORD of the maximum size of the file mapping object. > > `dwMaximumSizeLow` > > The low-order DWORD of the maximum size of the file mapping object. > > If this parameter and dwMaximumSizeHigh are 0 (zero), the maximum size of > the file mapping object is equal to the current size of the file that hFile > identifies. > > An attempt to map a file with a length of 0 (zero) fails with an error code > of ERROR_FILE_INVALID. Applications should test for files with a length of > 0 (zero) and reject those files. Hopefully that last paragraph only applies when `dwMaximumSize` is zero (which this PR avoids), but that is not clear to me.
Fixes danburkert#72 Previously, mapping an empty file or requesting a zero-size mapping would return an error because underlying OS APIs do not support zero-size mappings. Now, this crate instead requests a one-byte mapping (which most lilkely ends up rounded up to one memory page) but still has `Deref` return an empty slice in those cases. This is the behavior suggested for Unix in danburkert#72 (comment) On Windows, docs for `CreateFileMappingW` say: https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-createfilemappingw > `dwMaximumSizeHigh` > > The high-order DWORD of the maximum size of the file mapping object. > > `dwMaximumSizeLow` > > The low-order DWORD of the maximum size of the file mapping object. > > If this parameter and dwMaximumSizeHigh are 0 (zero), the maximum size of > the file mapping object is equal to the current size of the file that hFile > identifies. > > An attempt to map a file with a length of 0 (zero) fails with an error code > of ERROR_FILE_INVALID. Applications should test for files with a length of > 0 (zero) and reject those files. Hopefully that last paragraph only applies when `dwMaximumSize` is zero (which this PR avoids), but that is not clear to me.
I’ve implemented this approach in RazrFalcon#25. Unfortunately the same idea does not seem to work on Windows: RazrFalcon#25 (comment)
With my PR on my Linux machine, trying to map |
Fixes danburkert#72 Previously, mapping an empty file or requesting a zero-size mapping would return an error because underlying OS APIs do not support zero-size mappings. Now, this crate instead requests a one-byte mapping (which most lilkely ends up rounded up to one memory page) but still has `Deref` return an empty slice in those cases. This is the behavior suggested for Unix in danburkert#72 (comment) On Windows, docs for `CreateFileMappingW` say: https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-createfilemappingw > `dwMaximumSizeHigh` > > The high-order DWORD of the maximum size of the file mapping object. > > `dwMaximumSizeLow` > > The low-order DWORD of the maximum size of the file mapping object. > > If this parameter and dwMaximumSizeHigh are 0 (zero), the maximum size of > the file mapping object is equal to the current size of the file that hFile > identifies. > > An attempt to map a file with a length of 0 (zero) fails with an error code > of ERROR_FILE_INVALID. Applications should test for files with a length of > 0 (zero) and reject those files. Hopefully that last paragraph only applies when `dwMaximumSize` is zero (which this PR avoids), but that is not clear to me.
@faulesocke This happens to work with RazrFalcon#25 on my machine with these particular kernel and glibc versions, but do you know of documentation that supports this pattern as a correct use of (All that I find is discussion of the bytes between the end of the file and the next page boundary, but zero is already page-aligned so not relevant here.) Python’s |
Ah! I think I found it in the POSIX definition: https://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html
It goes on:
So reading from or writing to the area mapped past the end of the file can cause SIGBUS but the existence of that area is fine. |
See <danburkert/memmap-rs#72>. Unix can handle an mmap to an empty file, but Windows cannot.
See <danburkert/memmap-rs#72>. Unix can handle an mmap to an empty file, but Windows cannot.
See <danburkert/memmap-rs#72>. Unix can handle an mmap to an empty file, but Windows cannot.
As per the MSDN documentation for CreateFileMapping:
An attempt to map a file with a length of 0 (zero) fails with an error code of ERROR_FILE_INVALID. Applications should test for files with a length of 0 (zero) and reject those files.
I believe this should not be the expected behavior for the
memmap-rs
crate, and we should check if the size is0
(we already get the size before mapping the file anyway), and return an empty-slice backed Mmap struct in that case.The text was updated successfully, but these errors were encountered: