Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct Compilation to WebAssembly #1498

Open
RossTate opened this issue Feb 1, 2024 · 2 comments
Open

Correct Compilation to WebAssembly #1498

RossTate opened this issue Feb 1, 2024 · 2 comments

Comments

@RossTate
Copy link

RossTate commented Feb 1, 2024

This is the requested follow-up to the discussion I started in this week's CG meeting. The slides I presented can be found here, though I'll provide a recap below to make this self-contained.

One can compile the following C or C++ program

extern void launch_missiles();​

extern void foo_checkin();​
extern void bar_checkin();​

void (*mutable_global)() = &launch_missiles;​

void benign() {}​
void foo_internal(void (**func_ptr_ptr)()) { foo_checkin(); }​
/*export*/ void foo() {​
 void (*func_ptr)() = &benign;​
 foo_internal(&func_ptr)​
 (*func_ptr)();​
}​

void bar_internal(int *int_ptr) { bar_checkin(); }​
/*export*/ int bar(int input) { bar_internal(&input); return input; }​

to a WebAssembly module with imports $launch_missiles, $foo_checkin, $bar_checkin: [] -> [] and exports $foo: [] -> [] and $bar: [] -> [] (and nothing else).

The key question is: what does it mean for this compilation to be "correct"?

We are not going to answer that question in full right now, but a reasonable litmus test is that no one should be able to use the compiled WebAssembly module to call launch_missiles.

Why should we care about this litmus test? The idea is that this program is given access to a critical capability, and we want to know that capability is used only under the correct circumstances. In a more realistic setting there would be some specific condition (e.g. given a particular keycode), but to keep this example simple there is in fact no circumstance under which we want the missiles to be launched.

Why is this a reasonable test of correctness? One can look at the source code and see that it never calls launch_missiles. Indeed, according to both C and C++'s semantics, void foo() is equivalent to { foo_checkin() }, and int bar(int input) is equivalent to { bar_checkin(); return input; } regardless of what the extern functions are.

First Questions/Conditions

The first think to consider is what can users of an instance of this module do with it, and in particular what can its imports do and what can be done with its exports? There isn't much said about this, but it's critical to establishing correctness.

For example, because the address of launch_missiles is taken, the standard compilation would assign it an index within the funcref table used for implementing C/C++ function pointers. So if a user or imported function could access this table through some means besides the instance's exports, then it could use the table to perform a call_indirect to launch_missiles. But this table is not an export of this compiled module. So does that mean it cannot be accessed by any means except for the instance's exports? It wouldn't be a problem for protecting the machine the program is running on, so we could allow this; we could add instructions for enumerating all the tables live in any instances, or embedders could allow imports to inspect internals of the module instances they're called from. But this seems to defeat the purpose of WebAssembly's encapsulation, and it certainly makes it impossible for any reasonable compilation to pass our litmus test.

So correct compilation depends not just on what WebAssembly programs can do but also on what their environment can do. WebAssembly's approach to multi-language support is for languages to ship their runtime as part of the wasm module, and runtimes are used to running in an isolated process. They effectively rely upon having their own memory that they have exclusive access to (unless explicit permissions are granted to others) and that the environment will not modify (unless explicitly requested). So, to emulate this, we could say that a "good" embedding is not allowed to access or modify unexported internals of a WebAssembly module except through its exports. My impression is that this is an expectation we already all share; it's just not set in stone anywhere that I know of.

Launching the Missiles

But even if we assume these expectations of the embedder hold, the standard compilation model for C/C++ fails our litmus test in the standard embedder for WebAssembly (i.e. JavaScript) due to a gap in WebAssembly's design. That is, both the requirements of the embedder and the design of WebAssembly can influence how easy it is to compile correctly to WebAssembly. To see how, consider the following execution:

  1. Someone calls foo​.
  2. foo calls foo_internal​. Because foo_internal expects a pointer to a mutable address on the stack, which would be unsafe to point directly into the wasm stack, the compiled C/C++ runtime supports this by adding a location on a manually managed shadow stack, with a global storing the location of the current frame of the shadow stack. The index of benign in the funcref table is stored in that added location, and the address of that location is passed to foo_internal.
  3. foo_internal calls foo_checkin​.
  4. JS: foo_checkin calls bar with the integer value that happens to be the index of launch_missiles​ within the funcref table.
  5. bar calls bar_internal, adding a location onto the shadow stack and storing that integer value into that location.
  6. bar_internal calls bar_checkin​.
  7. bar_checkin traps. This causes the frames for bar_internal and bar to be popped off wasm stack​, throwing the trap as an exception to the most recent JS caller into wasm.
  8. foo_checkin catches the trap and returns​.
  9. foo_internal executes (*func_ptr)()​. In order to do so, it must first get the possible-mutated value of func_ptr stored in the current frame of the shadow stack. When the wasm frames for bar_internal and bar were removed, the shadow stack was not correspondingly updated, and so it still has two "frames"—the one added for foo in order to call foo_internal, and the one added for bar in order to call bar_internal—and believes the current frame is the one for bar. As such, the bar frame is used as if it were a foo frame, and the value stored there—the integer value that happens to be the index of launch_missiles​ within the funcref table—is used as a function pointer. So this results in a call_indirect to launch_missiles!

Fixing the Problem

We can approach these issues with correct compilation in a few ways.

One is to say that there is no fault here; if a WebAssembly program is composed with are embedded within code that has an error, then it is alright for that error to break encapsulated invariants of the program. This approach essentially expands the "trusted computed base" from not just the component that was given a critical but to the entire system; that is, it eliminates the ability for developers to reasonably contain an important problem within a small part of a bigger picture.

Another approach is to inform compilers to WebAssembly to not make assumptions about the embedder or imports. This approach pushes the work onto toolchains, and possibly onto developers if those toolchains are unable to ensure correct compilation without those assumptions. In this example, compilers for any languages using shadow stacks (which could include even memory-safe languages like C# and Go whose runtimes also rely on addresses into stacks) would at most be reasonably able to add stack canaries and dynamic guards to detect that an invariant has been violated and likewise trap in an effort to limit its repercussions. I worry that this approach is effectively passing the buck to people who are not well-positioned to address the problem properly.

Another approach is to determine which of the guarantees existing "good" embedders and imports already have are enough to enable reasonable correct compilation, possibly with small extensions to WebAssembly. In this example, extending WebAssembly to support unwinding on traps (e.g. through catch_everything variant to catch_all) and codifying that imports behave like exception-throwing/trapping functions that can only access/modify internals that have been made available through exports would solve the problem.

Kicking off Discussion

This final approach is what I would advocate for. It seems like current WebAssembly embeddings and design is already reasonably close to providing compiled runtimes the same sort of memory and control isolation they would have as isolated processes. It might very well just be a matter of codifying the properties current WebAssembly embeddings already have and adding something akin to catch_everything. But the point of this is issue is to start the conversation on this topic (continuing the thoughtful dialogue we had in the CG meeting), and I imagine there are many questions to ponder before we get to even deciding what to resolve, let alone how.

@kripken
Copy link
Member

kripken commented Feb 1, 2024

Expectations

this table is not an export of this compiled module. So does that mean it cannot be accessed by any means except for the instance's exports? [..] we could say that a "good" embedding is not allowed to access or modify unexported internals of a WebAssembly module except through its exports. My impression is that this is an expectation we already all share; it's just not set in stone anywhere that I know of.

I think this is already expected, yes. Here is a simplified question that relies on the same assumption:

(local.set $x (i32.const 10))
(nop)
(local.get $x) ;; what is the value here?

The spec-defined wasm behavior does not allow any change to the local in between the set and the get. Now, in theory, a user could put a debugger break point on the nop and modify the local manually there, but that is an operation not within the defined behavior.

Note that wasm optimizers have to rely on such assumptions to do any work. wasm-opt definitely assumes that unexported memories are not modified, locals are not observable from outside the function, etc. etc. That is, optimizers - and I think everyone else in the wasm ecosystem? - already assume any defined behavior in the spec is possible, and are free to ignore any behavior not defined.

ABIs

the shadow stack was not correspondingly updated

From a toolchain point of view, there is an ABI that both sides are expected to follow. The C stack ABI is defined in tool-conventions, though I see that we don't explicitly mention there who is responsible for restoring the stack (which is probably worth noting).

So the exploit here is a case of an ABI not being followed (or an ABI mismatch). Code that compiles to wasm and expects to be linked with other code (wasm or JS) needs to know what interface it has with that other code, and trust they follow the same ABI. This is what the toolchain space does (not just in wasm), and is not exactly identical to any of the options listed under "Fixing the problem": There is trust between modules, but just on matters of the ABI.

With all that said, ABI bugs and mismatches are both a nuisance and a danger, so you are right to worry about them. Where it is possible to make modules depend on fewer ABI details that is a good thing. But the question is the tradeoff, which will be different in each case: allowing traps to be caught has downsides; adding emitted code to re-align the stack after every call has a code size downside; etc.

It might be interesting to think about a "hardened ABI" that minimizes the cross-module risk as much as possible, as an option where more overhead is acceptable. But the component model already defines shared-nothing modules, so perhaps that is already covered?

Pedantic addendum (apologies)

This is not important to your larger point, but IIUC the exploit described here will not actually work in practice. Looking in the emitted wat, we can see that the ABI implementation is actually careful to not depend on the stack pointer global. Rather, it stashes the proper stack pointer at the top of the function into a local and uses that. In simplified form:

(local.set $sp (global.get $sp))
(call $danger)
(i32.load (local.get $sp)) ;; use the local, *not* the global which $danger might modify

(to follow this in the linked gist, look at the local $0 which is where we load the function pointer from, at offset 12 into the stack). In other words, the current ABI implementation is already guarding against this particular danger.

But, again, this does not change the larger point, as related exploits are possible, for example, the called function could trample the shadow stack (in the gist, where the 1 is written at offset 12). For examples of such exploits see "Everything Old is New Again: Binary Security of WebAssembly".

Overall it is definitely worth thinking about these things, including ABI improvements (perhaps empowered by new wasm proposals).

@RossTate
Copy link
Author

RossTate commented Feb 2, 2024

For examples of such exploits see "Everything Old is New Again: Binary Security of WebAssembly".

Oh yes, thanks for adding this reference. For those unfamiliar with it, it shows how to exploit vulnerabilities in C/C++ code even when compiled to WebAssembly. Its exploits rely on vulnerabilities in the original source code, but—as @kripken points out—the larger issue we're discussing essentially injects invariant-violatations not in the original source code, which then these techniques could further exploit.

Rather, it stashes the proper stack pointer at the top of the function into a local and uses that.

Ah, interesting. I know this is necessary in the case where dynamically-sized data is stored on the stack, but I'm curious why you do this as a norm given that it unnecessarily increases the size of the wasm frame (though maybe that doesn't matter much for C/C++ since its programs tend to have shallow stacks). As you say, this doesn't change the larger point, so it'd probably be a distracting tangent for us to go into here. So the one comment I'll add, in case others get the mistaken impression that this is an easy fix for tooling to do, is that while this ensures the shadow-stack frame is kept in line with the wasm-stack frame, that is not generally sufficient for correctness. Programs and runtimes often have other invariants for which operations need to be performed during unwinding. C programs do this manually; C++ programs often use destructors for this; C# uses finally; Go programs use defer; Swift programs use defer and the Swift runtime updates reference counts; and the Rust runtime even relies on destructors for type safety, since Rust programs can reflect these invariants in the type system.

ABIs

While it makes sense for compilation units within a program to have to coordinate shadow-stack conventions, I don't believe this should be expected for my example. In my example, the WebAssembly module is self-contained and fully-encapsulated. Users and imports should not even need to know it has a shadow stack; even if they did, they would have no way to interact with it given that it is not exported. Maybe it would be better to label the relevant functions as extern "WebAssembly" to clarify that they are expected to be linked to/as arbitrary WebAssembly functions.

There are two commonly discussed settings for these self-contained and fully-encapsulated modules: the component model and web libraries. I'm focusing on the latter because it is currently more concrete. My example program has only scalar inputs/outputs in its imports/exports, so it should be clear how it could be used as a web library, e.g. as an EcmaScript Module. In this web setting, there's another related example to consider. Compilers like Kotlin/JS and Kotlin/wasm compile the same Kotlin source code to different backends. Yet Kotlin/wasm cannot compile finally to WebAssembly such that it always fires regardless of imported functions, whereas Kotlin/JS can. Given that it's a web standard, it seems problematic that WebAssembly has weaker capabilities and guarantees for interacting with its environment than JavaScript (beyond the intentional weakenings like only having access to explicitly provided imports). One benefit of the approach I would advocate for is that it also closes that gap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants