WASI journal and stateful persistence #4263

Merged
merged 137 commits into from
Jan 4, 2024
Changes from 127 commits
6d407eb
Reduced verbosity on a warning
john-sharratt Oct 18, 2023
523add3
Added the draft API and CLI changes needed for the snapshot functiona…
john-sharratt Oct 20, 2023
6940ba8
Added more snapshot trigger types
john-sharratt Oct 20, 2023
6d04848
Puts the snapshot functionality behind a feature flag
john-sharratt Oct 20, 2023
c1872fd
Added an example on how certain events will be loggin to the snap sho…
john-sharratt Oct 20, 2023
ec74b83
Added the basics of the snapshot mechanism
john-sharratt Oct 22, 2023
8f11ac7
Addressed comments and added a basic doc
john-sharratt Oct 22, 2023
99ecf2b
Added compactor and filter snapshot capturer
john-sharratt Oct 23, 2023
2a9e147
Added missing snapshot-on type
john-sharratt Oct 23, 2023
85b12b1
Added an init log entry which has the WASM hash
john-sharratt Oct 23, 2023
1fe4303
Added the remaining data structures for the snapshot journal
john-sharratt Oct 23, 2023
ebfe824
Cleaned up type complexity using WasiResult
john-sharratt Oct 24, 2023
34f233b
Wired up the snapshot triggers
john-sharratt Oct 24, 2023
b829407
Runtime will now snapshot when the STDIN trigger is used
john-sharratt Oct 24, 2023
19c3946
Moved the snapshot trigger list from the process to WasiEnv which is …
john-sharratt Oct 24, 2023
31c4a61
Refactored the snapshot feature toggle
john-sharratt Oct 24, 2023
24368b5
More wiring of the snapshot options and the intercepts
john-sharratt Oct 24, 2023
e299170
Added a snapshot trigger on reading the environment variables
john-sharratt Oct 24, 2023
63980d8
Minor document updates
john-sharratt Oct 24, 2023
ffeeb33
Process will now properly terminate when snapshots fail
john-sharratt Oct 24, 2023
34fa6ea
Process will now properly terminate when snapshots fail
john-sharratt Oct 24, 2023
485176b
Simplified the checkpoint loop
john-sharratt Oct 24, 2023
6e557ba
The runtime now properly saves the journal events to a snapshot file
john-sharratt Oct 25, 2023
c3f0e52
Now currently saving the snapshot state
john-sharratt Oct 25, 2023
27dcb1d
Now currently saving the snapshot state
john-sharratt Oct 25, 2023
a99b840
Now currently saving the snapshot state
john-sharratt Oct 25, 2023
e320aee
The rewinding is now fixed again for snapshots
john-sharratt Oct 25, 2023
46aa873
Finished a basic version of the snapshot restoration
john-sharratt Oct 25, 2023
c519022
Added more snapshot events for the file system
john-sharratt Nov 10, 2023
3efa30e
Added intercept and replay calls for all the major file system operat…
john-sharratt Nov 11, 2023
7cef370
Merge remote-tracking branch 'origin/master' into dcgi
john-sharratt Nov 11, 2023
397daca
Merge remote-tracking branch 'origin/master' into dcgi
john-sharratt Nov 18, 2023
f167eb9
cargo fmt
john-sharratt Nov 19, 2023
7d023ea
Fixed compile errors without the snapshot feature
john-sharratt Nov 19, 2023
1223d37
Fixed compile errors without the snapshot feature
john-sharratt Nov 19, 2023
4aa1ec5
More feature toggle fixes
john-sharratt Nov 19, 2023
ccc4c29
Refactored the snapshot to journal to make it clearer how it works
john-sharratt Nov 19, 2023
4906460
Updated the journal documentation file
john-sharratt Nov 19, 2023
00fc6f6
More wiring of the journal snapshot triggers
john-sharratt Nov 20, 2023
8337e72
Clippy fixes
john-sharratt Nov 20, 2023
687667d
Fixed the JSC compile errors
john-sharratt Nov 20, 2023
897b8d7
Added a panic journal event
john-sharratt Nov 20, 2023
2f4baeb
More clippy fixes
john-sharratt Nov 20, 2023
9cd4c44
More cleanup of the CLI and naming
john-sharratt Nov 20, 2023
305fc4c
Renamed the journaling submodule
john-sharratt Nov 20, 2023
ffad63c
Update of gramma
john-sharratt Nov 20, 2023
cbeff30
Fix for a linting issue
john-sharratt Nov 20, 2023
e36b898
Added an event for process exit and a module hash to validate the sam…
john-sharratt Nov 20, 2023
85d4b6d
Fixed some linting issues
john-sharratt Nov 20, 2023
0dda34c
Fixed some unit tests
john-sharratt Nov 20, 2023
b539572
Fixed a clippy error
john-sharratt Nov 20, 2023
e80ddb4
Fix for some unit tests
john-sharratt Nov 20, 2023
e6e3478
Made the run commands backwards compatible
john-sharratt Nov 20, 2023
a4bc071
Undid the example refactor
john-sharratt Nov 20, 2023
8a9d5a8
Added events for ports and sockets
john-sharratt Nov 20, 2023
61c6825
Split the port and socket syscalls so they can be invokes seperately
john-sharratt Nov 20, 2023
e53e11c
Added the socket syscall hooks
john-sharratt Nov 21, 2023
8af77a2
Finished off the sockets for journaling
john-sharratt Nov 21, 2023
37d6c57
Fixed some linting issues
john-sharratt Nov 21, 2023
74c1d31
Fixed a compile issue
john-sharratt Nov 21, 2023
68604f1
More linting fixes
john-sharratt Nov 21, 2023
f852722
Fixed a regression issue on the socket binding
john-sharratt Nov 21, 2023
00a8873
The journal entries can now support zero copy operations
john-sharratt Nov 21, 2023
a57f30d
Switched to rkyv for the serialization of the journal so that reads a…
john-sharratt Nov 21, 2023
cbfd167
Added unit tests for the journal and fixed the rkyv serializer
john-sharratt Nov 22, 2023
41f4a92
cargo update
john-sharratt Nov 22, 2023
09efd35
Added the compacting journal with some basic test cases
john-sharratt Nov 22, 2023
c73b5a8
Fixed some linting issues
john-sharratt Nov 22, 2023
d4fdbc5
Another fix for the linting
john-sharratt Nov 22, 2023
787f9e9
More linting fixes and additional tests
john-sharratt Nov 22, 2023
e4fc594
Finished the compactor and added CLI
john-sharratt Nov 23, 2023
abe6691
Merge remote-tracking branch 'origin/master' into dcgi
john-sharratt Nov 23, 2023
0f1a9fa
Added a couple more unit tests
john-sharratt Nov 23, 2023
38d74be
Some linting and compile fixes
john-sharratt Nov 23, 2023
0d728c3
Fix for lints
john-sharratt Nov 23, 2023
7875068
Another linting fix
john-sharratt Nov 23, 2023
bce86dd
Added better error handling that reduces a panic
john-sharratt Nov 23, 2023
b6eb4f6
Fixed a runtime in runtime bug on journal replays
john-sharratt Nov 23, 2023
4f6a193
Missing rewinds no long cause the process to fail
john-sharratt Nov 23, 2023
74edba8
Added additional method for rewinding
john-sharratt Nov 23, 2023
d1d5b8e
Removed the journals parameter from the API
john-sharratt Nov 23, 2023
cc95b87
Fixed the process exit record replay
john-sharratt Nov 23, 2023
10e9c02
Fixed a bug where the journals were not being returned by the runtime
john-sharratt Nov 23, 2023
cc97e1b
Fixed the bootstrapping process for one of the execution paths
john-sharratt Nov 23, 2023
0320f04
Added another bootstrap and cleaned up the an interface
john-sharratt Nov 23, 2023
36cb053
The journals are now properly being restored
john-sharratt Nov 23, 2023
953a0af
Fixed the linting and compile error
john-sharratt Nov 23, 2023
4baa50a
Added safety to the journal restore code
john-sharratt Nov 23, 2023
fc19f5f
Linting fixes
john-sharratt Nov 23, 2023
4abc215
More compile fixes
john-sharratt Nov 24, 2023
52ee3ea
Module mismatch is now a warning
john-sharratt Nov 24, 2023
ec617db
We now restore the TTY state after the program executes
john-sharratt Nov 26, 2023
1fea8d7
Fixed an issue where the restoration of the journal was not properly …
john-sharratt Nov 26, 2023
489f82c
Fixed a linting issue
john-sharratt Nov 26, 2023
3eaabd9
The TTY is now delayed during the journal bootstrap process
john-sharratt Nov 26, 2023
e1c2570
Fixed another linting issue
john-sharratt Nov 26, 2023
88c5d8c
Fixed the padding on the change directory event
john-sharratt Nov 26, 2023
46f1503
Removed the pop journals call which is not needed
john-sharratt Nov 26, 2023
f62f568
Fixed a bug where the offsets were not updating properly on journal w…
john-sharratt Nov 27, 2023
6ab87f9
Fixed a bug where the module hash plumbing was not correct
john-sharratt Nov 27, 2023
fd4f55e
Implemented fixes so the journal restoration works properly
john-sharratt Nov 27, 2023
69bcc61
The stdout and stderr are no longer restored if the process exits
john-sharratt Nov 27, 2023
d494348
Changes made after review with Christoph plus added compression of jo…
john-sharratt Nov 27, 2023
a41c996
Added some linting fixes and fix for compile erors
john-sharratt Nov 27, 2023
7d79610
Merge remote-tracking branch 'origin/master' into dcgi
john-sharratt Dec 20, 2023
dd74d08
Added the basics of a DCGI runner
john-sharratt Dec 21, 2023
aba3c65
Finished off the basics of DCGI
john-sharratt Dec 21, 2023
355fcfa
Now opening the preopens when the WASM environment is reinitialized
john-sharratt Dec 21, 2023
a06789d
Fixed an issue where the callbacks for DCGI were not be wired properly
john-sharratt Dec 21, 2023
8b9b8f6
Now properly propogating the errors to the caller
john-sharratt Dec 21, 2023
9e3f3c0
Now reusing DCGI instances but not the memory associated with them
john-sharratt Dec 22, 2023
ffb45ea
DCGI can not reuse memory for now due to several restrictions
john-sharratt Dec 22, 2023
25a4824
Added a fix so that stderr is properly returned on DCGI calls
john-sharratt Dec 22, 2023
0174592
Merge remote-tracking branch 'origin/master' into dcgi
john-sharratt Dec 23, 2023
e732ff8
Fixes after the latest merge
john-sharratt Dec 23, 2023
9caf4bf
Lint fixes and compile fixes
john-sharratt Dec 23, 2023
472d918
More linting and compile fixes
john-sharratt Dec 23, 2023
79f4d26
Added support for journals to DCGI and removed sharding
john-sharratt Dec 23, 2023
b4131b0
Fixed the journal wiring to DCGI
john-sharratt Dec 23, 2023
ca401d8
Compacting a journal will now remove more redundant file system event…
john-sharratt Dec 23, 2023
515aa16
Fixed an issue where the compacting logs were not rotating and compac…
john-sharratt Dec 23, 2023
803529a
Linting and compile fixes
john-sharratt Dec 23, 2023
b2b528e
Added more unit test fixes, linting and compile errors
john-sharratt Dec 23, 2023
0a32ca7
Merge remote-tracking branch 'origin/master' into dcgi
john-sharratt Dec 23, 2023
4bd2b9d
Linting fix
john-sharratt Dec 23, 2023
9bb6f18
Fixed a race condition in the virtual-net unit tests
john-sharratt Dec 23, 2023
4420e61
Excluding tokio from the docs build
john-sharratt Dec 23, 2023
1d5e5ec
Merge branch 'master' into dcgi
john-sharratt Jan 3, 2024
eb96a9b
Added comments on the JournalEffector
john-sharratt Jan 3, 2024
fb6d4d6
Split out the config responsibility from the FilteredJournal
john-sharratt Jan 3, 2024
58a8941
Added a comment on the excluding of tokio from the docs build
john-sharratt Jan 3, 2024
7c1b68d
Moved most of the journal functionality out into its own crate
john-sharratt Jan 3, 2024
21a23ef
Fixed some linting and compile issues from last refactor
john-sharratt Jan 4, 2024
77bb3cc
More linting fixes
john-sharratt Jan 4, 2024
d80d469
No longer re-exporting standard types in virtual-net
john-sharratt Jan 4, 2024
b6ef7dd
Removed the wasix and virtual-net dependencies from the archived events
john-sharratt Jan 4, 2024
4a22ad5
Merge remote-tracking branch 'origin/master' into dcgi
john-sharratt Jan 4, 2024
3 changes: 2 additions & 1 deletion .vscode/settings.json
@@ -8,5 +8,6 @@
"editor.formatOnSave": true,
"editor.formatOnPaste": false,
"editor.formatOnType": false
-}
+},
+"rust-analyzer.showUnlinkedFileNotification": false
}
60 changes: 58 additions & 2 deletions Cargo.lock

8 changes: 4 additions & 4 deletions Makefile
@@ -439,10 +439,10 @@ test-build-docs-rs:
fi; \
printf "*** Building doc for package with manifest $$manifest_path ***\n\n"; \
if [ -z "$$features" ]; then \
-RUSTDOCFLAGS="--cfg=docsrs" $(CARGO_BINARY) +nightly doc $(CARGO_TARGET_FLAG) --manifest-path "$$manifest_path" --locked || exit 1; \
+RUSTDOCFLAGS="--cfg=docsrs" $(CARGO_BINARY) +nightly doc $(CARGO_TARGET_FLAG) --manifest-path "$$manifest_path" --exclude tokio --locked || exit 1; \
else \
printf "Following features are inferred from Cargo.toml: $$features\n\n\n"; \
-RUSTDOCFLAGS="--cfg=docsrs" $(CARGO_BINARY) +nightly doc $(CARGO_TARGET_FLAG) --manifest-path "$$manifest_path" --features "$$features" --locked || exit 1; \
+RUSTDOCFLAGS="--cfg=docsrs" $(CARGO_BINARY) +nightly doc $(CARGO_TARGET_FLAG) --manifest-path "$$manifest_path" --exclude tokio --features "$$features" --locked || exit 1; \
fi; \
done

@@ -457,10 +457,10 @@ test-build-docs-rs-ci:
fi; \
printf "*** Building doc for package with manifest $$manifest_path ***\n\n"; \
if [ -z "$$features" ]; then \
-RUSTDOCFLAGS="--cfg=docsrs" $(CARGO_BINARY) +nightly-2023-05-25 doc $(CARGO_TARGET_FLAG) --manifest-path "$$manifest_path" --locked || exit 1; \
+RUSTDOCFLAGS="--cfg=docsrs" $(CARGO_BINARY) +nightly-2023-05-25 doc $(CARGO_TARGET_FLAG) --manifest-path "$$manifest_path" --no-deps --locked || exit 1; \
else \
printf "Following features are inferred from Cargo.toml: $$features\n\n\n"; \
-RUSTDOCFLAGS="--cfg=docsrs" $(CARGO_BINARY) +nightly-2023-05-25 doc $(CARGO_TARGET_FLAG) --manifest-path "$$manifest_path" --features "$$features" --locked || exit 1; \
+RUSTDOCFLAGS="--cfg=docsrs" $(CARGO_BINARY) +nightly-2023-05-25 doc $(CARGO_TARGET_FLAG) --manifest-path "$$manifest_path" --no-deps --features "$$features" --locked || exit 1; \
fi; \
done

148 changes: 148 additions & 0 deletions docs/journal.md
@@ -0,0 +1,148 @@
# WASM Journal Functionality

Wasmer now supports journals for the state of a WASM process. This gives the ability
to persist changes made to the temporary file system and to save and store snapshots
of the running process.

The journal file is a linear history of the events that occurred while the process was
running. Replaying it brings the process back to a discrete and deterministic
state.

Journal files can be concatenated, compacted and filtered to change the discrete state.

These journals are maintained in a consistent and durable way, ensuring that
a failure of the system while the process is running does not corrupt the journal.
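
As a rough illustration of the replay idea described above, the following sketch models a journal as an ordered list of events applied to a toy file system. The names `JournalEvent`, `ToyState`, `replay` and `concat` are hypothetical and are not the types used by Wasmer; the real journal records many more kinds of events.

```rust
use std::collections::BTreeMap;

/// Hypothetical journal entry (illustration only, not the Wasmer type).
#[derive(Debug, Clone, PartialEq)]
pub enum JournalEvent {
    /// Replace the contents of the file at `path` with `data`.
    WriteFile { path: String, data: Vec<u8> },
    /// Remove the file at `path`.
    RemoveFile { path: String },
}

/// Toy stand-in for the state of the sandboxed file system.
pub type ToyState = BTreeMap<String, Vec<u8>>;

/// Applies the events in order. The same journal always yields the same state.
pub fn replay(journal: &[JournalEvent]) -> ToyState {
    let mut state = ToyState::new();
    for event in journal {
        match event {
            JournalEvent::WriteFile { path, data } => {
                state.insert(path.clone(), data.clone());
            }
            JournalEvent::RemoveFile { path } => {
                state.remove(path);
            }
        }
    }
    state
}

/// Concatenation appends one history after the other.
pub fn concat(a: &[JournalEvent], b: &[JournalEvent]) -> Vec<JournalEvent> {
    a.iter().chain(b.iter()).cloned().collect()
}
```

In this model, replaying a concatenated journal produces the state that results from running the first history followed by the second.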

# Snapshot Triggers

The journal records state changes to the sandbox built around the WASM process as
it runs. For certain use cases, however, it may be important to take explicit snapshot
restoration points in the journal at key moments.

When a snapshot is triggered, all the running threads of the process are paused and
the state of the WASM memory and thread stacks is recorded into the journal so that
it can be restored.

In order to use the snapshot functionality, the WASM process must be compiled with
the `asyncify` modifications. This can be done using the `wasm-opt` tool.

Note: If a process does not have the `asyncify` modifications, you can still use
the journal functionality to record the file system and WASM memory state.
However, the stacks of the threads will be omitted, meaning a restoration will
restart the main thread.

Various triggers can cause a snapshot to be taken at a specific
point in time. These are as follows:

## On Idle

Triggered when all the threads in the process go into an idle state. This trigger
is useful to take snapshots at convenient moments without causing unnecessary overhead.

For processes that have TTY/STDIN input this is particularly useful.

## On FirstListen

Triggered when a listen syscall is invoked on a socket. This is a useful point at which
to take a snapshot when one wants to speed up the boot time of a WASM process
up to the moment where it is ready to accept requests.

## On FirstStdin

Triggered when the process reads stdin for the first time. This can be useful to
speed up the boot time of a WASM process.

## On FirstEnviron

Triggered when the process reads an environment variable for the first time. This can
be useful to speed up the boot time of a CGI WASM process which reads the environment
variables to parse the request that it must execute.

## On Timer Interval

Triggered periodically based on a timer (default 10 seconds), which can be configured
using the `journal-interval` option. This can be useful for asynchronous replication
of a WASM process from one machine to another with a particular lag latency.

## On Sigint (Ctrl+C)

Triggered if the user sends an interrupt signal (Ctrl+C).

## On Sigalrm

Triggered when the process receives SIGALRM, the alarm clock signal used for timers
(see `man alarm`).

## On Sigtstp

The SIGTSTP signal is sent to a process by its controlling terminal to request it to stop
temporarily. It is commonly initiated by the user pressing Ctrl-Z.

## On Sigstop

The SIGSTOP signal instructs the operating system to stop a process for later resumption.

## On Non Deterministic Call

Triggered when a non-deterministic call is made from the WASM process to the outside world
(i.e. it reaches out of the sandbox).
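
The triggers listed in this section could be modelled as a simple enumeration, as in the sketch below. The type `Trigger`, its variant names, and the kebab-case strings accepted by `from_str` are assumptions made for illustration; they are not taken from the Wasmer source or CLI. Only the 10 second default interval comes from the text above.

```rust
use std::str::FromStr;
use std::time::Duration;

/// Default timer interval mentioned in the documentation (10 seconds).
pub const DEFAULT_INTERVAL: Duration = Duration::from_secs(10);

/// Hypothetical list of snapshot triggers (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trigger {
    Idle,
    FirstListen,
    FirstStdin,
    FirstEnviron,
    Timer(Duration),
    Sigint,
    Sigalrm,
    Sigtstp,
    Sigstop,
    NonDeterministicCall,
}

impl Trigger {
    /// Triggers named "First..." fire at most once per process in this sketch.
    pub fn fires_once(self) -> bool {
        matches!(
            self,
            Trigger::FirstListen | Trigger::FirstStdin | Trigger::FirstEnviron
        )
    }
}

impl FromStr for Trigger {
    type Err = String;

    /// Parses an assumed kebab-case spelling of a trigger name.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(match s.trim().to_ascii_lowercase().as_str() {
            "idle" => Trigger::Idle,
            "first-listen" => Trigger::FirstListen,
            "first-stdin" => Trigger::FirstStdin,
            "first-environ" => Trigger::FirstEnviron,
            "timer" => Trigger::Timer(DEFAULT_INTERVAL),
            "sigint" => Trigger::Sigint,
            "sigalrm" => Trigger::Sigalrm,
            "sigtstp" => Trigger::Sigtstp,
            "sigstop" => Trigger::Sigstop,
            "non-deterministic-call" => Trigger::NonDeterministicCall,
            other => return Err(format!("unknown snapshot trigger: {}", other)),
        })
    }
}
```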

# Limitations

- A WASM process that wishes to record the state of its threads must have had the `asyncify`
post-processing step applied to the binary (see `wasm-opt`).
- Taking a snapshot can consume large amounts of memory while it is processing.
- Snapshots are not instant and incur overhead when they are generated.
- The layout of the memory must be known by the runtime in order to take snapshots.

# Design

On startup, if a restore journal file is specified, the runtime restores the
state of the WASM process by reading and processing the log entries in the snapshot
journal. This restoration brings the memory and the thread stacks back to a previous
point in time and then resumes all the threads.

When a trigger occurs, a new snapshot of the WASM process is taken, which involves
the following steps:

1. Pause all threads
2. Capture the stack of each thread
3. Write the thread state to the journal
4. Write the memory (excluding stacks) to the journal
5. Resume execution.
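
The five steps above can be sketched against a toy process model. Everything here (`SnapshotRecord`, `ToyThread`, `ToyProcess`, `snapshot`) is hypothetical and only mirrors the order of operations; the real runtime pauses actual threads and unwinds their stacks via `asyncify`.

```rust
/// Hypothetical journal records written by a snapshot (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum SnapshotRecord {
    ThreadState { id: u32, stack: Vec<u8> },
    Memory { bytes: Vec<u8> },
}

/// Toy thread: a flag standing in for "paused" plus its captured stack bytes.
pub struct ToyThread {
    pub id: u32,
    pub paused: bool,
    pub stack: Vec<u8>,
}

/// Toy process: threads plus linear memory, with stacks kept separately.
pub struct ToyProcess {
    pub threads: Vec<ToyThread>,
    pub memory: Vec<u8>,
}

pub fn snapshot(process: &mut ToyProcess, journal: &mut Vec<SnapshotRecord>) {
    // 1. Pause all threads.
    for thread in process.threads.iter_mut() {
        thread.paused = true;
    }
    // 2. Capture the stack of each thread.
    let stacks: Vec<(u32, Vec<u8>)> = process
        .threads
        .iter()
        .map(|t| (t.id, t.stack.clone()))
        .collect();
    // 3. Write the thread state to the journal.
    for (id, stack) in stacks {
        journal.push(SnapshotRecord::ThreadState { id, stack });
    }
    // 4. Write the memory to the journal. In this toy model the stacks live
    //    outside `memory`, so there is nothing to exclude.
    journal.push(SnapshotRecord::Memory {
        bytes: process.memory.clone(),
    });
    // 5. Resume execution.
    for thread in process.threads.iter_mut() {
        thread.paused = false;
    }
}
```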

The implementation is currently able to save and restore the following:

- WASM Memory
- Stack memory
- Call stack
- Open sockets
- Open files
- Terminal text

# Journal Capturer Implementations

## Log File Journal

Writes the log events to a linear log file on the local file system
as they are received by the trait. Log files can be concatenated
together to make larger log files.

## Unsupported Journal

The default implementation does not support snapshots and will error
out if an attempt is made to send it events. Using the unsupported
capturer as a restoration point will restore nothing but will not
error out.

## Compacting Journal

Deduplicates memory and stacks to reduce the volume of
log events sent to its inner capturer. Compacting the events occurs
in line as the events are generated.
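
One way to picture in-line deduplication is a wrapper that remembers the last payload forwarded for each memory offset and drops exact repeats. This is a minimal sketch under that assumption; `MemoryUpdate` and `CompactingSketch` are hypothetical names, the inner capturer is modelled as a plain `Vec`, and the actual compaction rules in Wasmer may differ.

```rust
use std::collections::HashMap;

/// Hypothetical memory event (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct MemoryUpdate {
    pub offset: u64,
    pub data: Vec<u8>,
}

/// Wrapper that forwards an update only if it differs from the last one
/// forwarded for the same offset.
#[derive(Default)]
pub struct CompactingSketch {
    last_seen: HashMap<u64, Vec<u8>>,
    inner: Vec<MemoryUpdate>,
}

impl CompactingSketch {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns `true` if the update was forwarded, `false` if it was dropped
    /// as a duplicate.
    pub fn write(&mut self, update: MemoryUpdate) -> bool {
        if self.last_seen.get(&update.offset) == Some(&update.data) {
            return false;
        }
        self.last_seen.insert(update.offset, update.data.clone());
        self.inner.push(update);
        true
    }

    /// Events that reached the inner capturer.
    pub fn forwarded(&self) -> &[MemoryUpdate] {
        &self.inner
    }
}
```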

## Filtered Journal

Filters out a specific set of log events and drops the rest. This
capturer can be useful for restoring to a previous call point while
retaining the memory changes (e.g. WCGI runner).
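
A filtering capturer can be sketched as a wrapper that forwards only the event kinds it has been configured to keep. The names `FilterKind`, `FilterEvent` and `FilteredSketch` are hypothetical, and reading the description above as "keep a chosen set, drop everything else" is an assumption.

```rust
/// Hypothetical event categories (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilterKind {
    Memory,
    FileSystem,
    ThreadState,
}

/// Hypothetical journal event carrying a category and an opaque payload.
#[derive(Debug, Clone, PartialEq)]
pub struct FilterEvent {
    pub kind: FilterKind,
    pub payload: Vec<u8>,
}

/// Wrapper that forwards only events whose kind is in `keep`.
pub struct FilteredSketch {
    keep: Vec<FilterKind>,
    inner: Vec<FilterEvent>,
}

impl FilteredSketch {
    pub fn new(keep: &[FilterKind]) -> Self {
        Self {
            keep: keep.to_vec(),
            inner: Vec::new(),
        }
    }

    /// Returns `true` if the event was forwarded to the inner capturer.
    pub fn write(&mut self, event: FilterEvent) -> bool {
        if self.keep.contains(&event.kind) {
            self.inner.push(event);
            true
        } else {
            false
        }
    }

    /// Events that reached the inner capturer.
    pub fn forwarded(&self) -> &[FilterEvent] {
        &self.inner
    }
}
```

Keeping `Memory` while dropping `ThreadState` corresponds to the case mentioned above, where memory changes are retained but execution restarts from an earlier call point.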
2 changes: 1 addition & 1 deletion examples/wasi_manual_setup.rs
@@ -65,7 +65,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
let start = instance.exports.get_function("_start")?;
start.call(&mut store, &[])?;

-wasi_env.cleanup(&mut store, None);
+wasi_env.on_exit(&mut store, None);

Ok(())
}
4 changes: 2 additions & 2 deletions lib/api/src/errors.rs
@@ -17,7 +17,7 @@ use wasmer_types::ImportError;
/// This is based on the [link error][link-error] API.
///
/// [link-error]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly/LinkError
-#[derive(Debug)]
+#[derive(Debug, Clone)]
#[cfg_attr(feature = "std", derive(Error))]
#[cfg_attr(feature = "std", error("Link error: {0}"))]
pub enum LinkError {
@@ -41,7 +41,7 @@ pub enum LinkError {
/// Trap that occurs when calling the WebAssembly module
/// start function, and an error when initializing the user's
/// host environments.
-#[derive(Debug)]
+#[derive(Debug, Clone)]
#[cfg_attr(feature = "std", derive(Error))]
pub enum InstantiationError {
/// A linking ocurred during instantiation.
2 changes: 1 addition & 1 deletion lib/api/src/exports.rs
@@ -42,7 +42,7 @@ use thiserror::Error;
/// // This results with an error: `ExportError::Missing`.
/// let export = instance.exports.get_function("unknown").unwrap();
/// ```
-#[derive(Error, Debug)]
+#[derive(Error, Debug, Clone)]
pub enum ExportError {
/// An error than occurs when the exported type and the expected type
/// are incompatible.
16 changes: 16 additions & 0 deletions lib/api/src/externals/memory.rs
@@ -121,6 +121,22 @@ impl Memory {
self.0.grow(store, delta)
}

+/// Grows the memory to at least a minimum size. If the memory is already big enough
+/// for the min size then this function does nothing
+pub fn grow_at_least(
+&self,
+store: &mut impl AsStoreMut,
+min_size: u64,
+) -> Result<(), MemoryError> {
+self.0.grow_at_least(store, min_size)
+}
+
+/// Resets the memory back to zero length
+pub fn reset(&self, store: &mut impl AsStoreMut) -> Result<(), MemoryError> {
+self.0.reset(store)?;
+Ok(())
+}
+
/// Attempts to duplicate this memory (if its clonable) in a new store
/// (copied memory)
pub fn copy_to_store(
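
The doc comments on `grow_at_least` and `reset` in the `memory.rs` change describe their intended behaviour. The sketch below illustrates those semantics on a toy memory that tracks only a page count. `ToyMemory` is hypothetical, and treating `min_size` as a byte count that is rounded up to whole 64 KiB WebAssembly pages is an assumption; the diff does not show how the real implementation interprets the value.

```rust
/// Size of a WebAssembly linear memory page in bytes (64 KiB).
pub const PAGE_SIZE: u64 = 65_536;

/// Toy memory that only tracks how many pages it holds (illustration only).
#[derive(Debug, Default)]
pub struct ToyMemory {
    pages: u64,
}

impl ToyMemory {
    /// Current size in bytes.
    pub fn size_bytes(&self) -> u64 {
        self.pages * PAGE_SIZE
    }

    /// Grows to at least `min_size` bytes (assumed unit). Does nothing if the
    /// memory is already big enough.
    pub fn grow_at_least(&mut self, min_size: u64) {
        if self.size_bytes() >= min_size {
            return;
        }
        // Round up to a whole number of pages.
        self.pages = (min_size + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    /// Resets the memory back to zero length.
    pub fn reset(&mut self) {
        self.pages = 0;
    }
}
```

A restore path could call `reset` followed by `grow_at_least` before copying journaled memory back in, although the diff shown here does not confirm that sequence.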