Conversation
@tyler there are a few things I'd like to add to the public API surface and documentation to go with this. It would probably be easiest if I did that on a branch and then PRed into this, is that alright with you?
```rust
AddrLocation::Globals | AddrLocation::SigStack => {
    tracing::error!("UFFD pagefault at unexpected location: {:?}", loc);
    uffd.wake(fault_page as *mut c_void, host_page_size())
        .map_err(|e| Error::InternalError(e.into()))?;
```
One of the things I've been seeing in the mmap-related PRs is a re-examination of errors like these. If there had been a UFFD pagefault at an unexpected location, should the lucet runtime return its generic "InternalError" or should it just panic?
See related PR: #486
This is a good call, it should just panic. I think @acfoltzer wanted to take care of that in a second PR. Though tbh I think we should just do it here.
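To make the suggestion concrete, here is a minimal, self-contained sketch (not the lucet implementation; `AddrLocation` and `handle_fault` are simplified stand-ins for the types in `uffd.rs`) of the change under discussion: a fault at an unexpected location means a runtime invariant was broken, so the handler panics instead of mapping the error into a generic `InternalError`.

```rust
// Simplified stand-in for the fault-location enum in uffd.rs.
#[derive(Debug)]
enum AddrLocation {
    Heap,
    Globals,
    SigStack,
}

// Hypothetical handler: expected locations are resolved normally; a fault
// in the globals or sigstack regions indicates the runtime's own memory
// layout invariants were violated, so there is nothing sensible to return.
fn handle_fault(loc: AddrLocation) {
    match loc {
        AddrLocation::Heap => {
            // resolve the fault here, e.g. via zeropage/copy
        }
        AddrLocation::Globals | AddrLocation::SigStack => {
            panic!("UFFD pagefault at unexpected location: {:?}", loc);
        }
    }
}

fn main() {
    handle_fault(AddrLocation::Heap); // expected location: no panic
    let result = std::panic::catch_unwind(|| handle_fault(AddrLocation::Globals));
    assert!(result.is_err()); // unexpected location: panics
    println!("unexpected-location fault panicked as intended");
}
```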
```rust
    // zero the sigstack
    (slot.sigstack, limits.signal_stack_size),
]
.into_iter()
```
I am new to userfaultfd, and I'm wondering about a small difference between its implementation of new_instance_with and the mmap implementation.
`lucet/lucet-runtime/lucet-runtime-internals/src/region/mmap.rs`, lines 121 to 122 in 83fd224:
```rust
// make the stack read/writable
(slot.stack, limits.stack_size),
```
So yeah, that's intentional here. We're making the allocation of the stack lazy. If you look at the other lines changed in that commit, you'll see where I make page faults in the Stack area zero out the page on demand.
That said, it does point to a lack of documentation here that I should fix.
Added some docs!
```rust
        .map_err(|e| Error::InternalError(e.into()))?,
);

// map the chunk of virtual memory for all of the slots
```
I'm guessing the answer to my question could be here. The memory is already marked as READ/WRITE for all the slots.
`lucet/lucet-runtime/lucet-runtime-internals/src/region/uffd.rs`, lines 136 to 139 in e7fe5f1:
```rust
AddrLocation::Stack => unsafe {
    uffd.zeropage(fault_page as *mut c_void, host_page_size(), true)
        .map_err(|e| Error::InternalError(e.into()))?;
},
```
Hm. If I'm reading that correctly, does that mean that the stack isn't zeroed out until it's actually needed?
It may help to refer to the `ioctl_userfaultfd(2)` man page on this, since `zeropage` does a lot more than what it sounds like: http://man7.org/linux/man-pages/man2/ioctl_userfaultfd.2.html (search for `UFFDIO_ZEROPAGE` on that page).

Basically, `zeropage` is one of several ways that the uffd worker thread can resolve a page fault in the memory it has registered. The stack starts off `mprotect`ed read/write but `madvise`d as not needed, so accesses will trigger a uffd fault. We then recognize that the fault address is in the stack, and call `zeropage` to both zero out the memory and resolve the fault, so that the instance's thread wakes up and can make progress once again.
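As a minimal illustration of the two bookkeeping steps the handler performs before it can call `zeropage`, here is a self-contained sketch (the `Slot`, `AddrLocation`, and `PAGE_SIZE` here are simplified stand-ins, not the lucet types): round the faulting address down to its page, since `UFFDIO_ZEROPAGE`/`UFFDIO_COPY` operate at page granularity, and classify which region of the slot the address falls in.

```rust
const PAGE_SIZE: usize = 4096; // assumed host page size for illustration

#[derive(Debug, PartialEq)]
enum AddrLocation {
    Heap,
    Stack,
    Unknown,
}

// Simplified stand-in for a region slot's memory layout.
struct Slot {
    heap_start: usize,
    heap_len: usize,
    stack_start: usize,
    stack_len: usize,
}

impl Slot {
    // Classify a faulting address by which part of the slot contains it.
    fn classify(&self, addr: usize) -> AddrLocation {
        if addr >= self.heap_start && addr < self.heap_start + self.heap_len {
            AddrLocation::Heap
        } else if addr >= self.stack_start && addr < self.stack_start + self.stack_len {
            AddrLocation::Stack
        } else {
            AddrLocation::Unknown
        }
    }
}

// Round a faulting address down to the start of its page, the granularity
// at which UFFDIO_ZEROPAGE / UFFDIO_COPY resolve faults.
fn fault_page(addr: usize) -> usize {
    addr & !(PAGE_SIZE - 1)
}

fn main() {
    let slot = Slot {
        heap_start: 0x10000,
        heap_len: 0x8000,
        stack_start: 0x20000,
        stack_len: 0x4000,
    };
    let addr = 0x21234;
    assert_eq!(fault_page(addr), 0x21000);
    assert_eq!(slot.classify(addr), AddrLocation::Stack);
    println!("fault at {:#x} -> page {:#x}, {:?}", addr, fault_page(addr), slot.classify(addr));
}
```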
Now that CircleCI is running mmap and userfaultfd tests, will we be able to merge this?

CircleCI experiments are currently only happening on branches, but I'm hoping to get a proper PR together today.

So excited!
…rting with entrypoint_test.
… a fork-related bug.
- Instantiates the suite in `lucet_runtime_internals::alloc::tests` for `UffdRegion`
- Fixes early return issues in `UffdRegion` similar to #455
- Adds a test to show that the per-instance heap limit applies to runtime expansions, not just initial instantiation
- Refactors `validate_runtime_spec` to take the per-instance heap limit as an additional argument. This centralizes the logic for rejecting initially-oversized heap limits, and makes it clearer what's happening in each region's instantiation logic.
- Removes the `UffdRegion`'s assertion that the signal stack size is a multiple of the page size. Since the user can now control this as a parameter, we reject it gracefully when validating `Limits` rather than panicking.
Leaving the question of errors in the handler alone for this commit, since that'll be a more major change.
Notably, this should get us building and running uffd in Linux CI. It turns out to be a tremendous pain to enable a feature flag for just one crate within a workspace. The situation is [being addressed][1], but in the meantime I believe the best route forward is to just have uffd on by default for Linux. [1]: rust-lang/cargo#5364
… should handle it gracefully, rather than imploding.
Ready for review :)
Approve approve approve!
The GitHub Actions tests are failing since they can't run the userfaultfd tests. I thought I'd try a little PR to have the offensive tests get skipped by GitHub Actions, since they are being run by CircleCI. See: #528
This pull request integrates Fastly's `userfaultfd`-based memory management backend as an alternate `Region` implementation.

Userfaultfd is a Linux-specific mechanism for handling page faults in userspace. (See https://www.kernel.org/doc/html/latest/admin-guide/mm/userfaultfd.html for a good technical explanation of it.) For Lucet, the use case is that when an `Instance` is started, none of the linear memory has to be copied initially.
We register the entire region with userfaultfd. When an instance is started, we set up the stacks, metadata, etc as normal, but we leave the pages of the linear memory "missing". When the instance starts and tries to access one of the linear memory pages, this triggers a page fault, as there is no physical memory backing the virtual memory. Since the region is registered with userfaultfd, this triggers a message to be sent to the userfaultfd handler thread. The handler thread determines which instance and module has faulted, and copies the necessary memory into place, before reawakening the instance thread.
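The demand-paging flow above can be modeled with a toy, single-threaded sketch (an assumption for illustration; the real handler runs on its own thread and resolves faults through kernel ioctls). Pages start out "missing"; the first access materializes a page either from module-defined initial contents (the analogue of `UFFDIO_COPY`) or by zero-filling (the analogue of `UFFDIO_ZEROPAGE`). `LazyMemory` and its fields are hypothetical names, not lucet APIs.

```rust
use std::collections::HashMap;

const PAGE_SIZE: usize = 4096; // assumed host page size

// Toy model of lazily-materialized linear memory.
struct LazyMemory {
    pages: HashMap<usize, Vec<u8>>,   // pages materialized so far
    initial: HashMap<usize, Vec<u8>>, // module-defined initial contents
}

impl LazyMemory {
    fn new(initial: HashMap<usize, Vec<u8>>) -> Self {
        LazyMemory { pages: HashMap::new(), initial }
    }

    // A read acts like a page fault: on first access, the "handler"
    // materializes the page, copying module data if any, else zeroes.
    fn read(&mut self, addr: usize) -> u8 {
        let page = addr / PAGE_SIZE;
        let offset = addr % PAGE_SIZE;
        if !self.pages.contains_key(&page) {
            let contents = match self.initial.get(&page) {
                Some(data) => data.clone(),   // UFFDIO_COPY analogue
                None => vec![0u8; PAGE_SIZE], // UFFDIO_ZEROPAGE analogue
            };
            self.pages.insert(page, contents);
        }
        self.pages[&page][offset]
    }
}

fn main() {
    let mut init = HashMap::new();
    init.insert(0, vec![0xABu8; PAGE_SIZE]); // page 0 carries module data
    let mut mem = LazyMemory::new(init);
    assert_eq!(mem.read(0), 0xAB);      // copied from module data on fault
    assert_eq!(mem.read(PAGE_SIZE), 0); // zero-filled on fault
    assert_eq!(mem.pages.len(), 2);     // only touched pages materialized
    println!("materialized {} pages", mem.pages.len());
}
```

The point of the model is the last assertion: nothing is copied or zeroed at instantiation time; work happens only for pages the instance actually touches.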
At its core, this provides a way to reduce startup time and increase flexibility in how memory is handled, at the cost of increased runtime overhead (by way of context switches especially).
(All credit for this actually goes to @acfoltzer.)