Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ctx needs ability to invoke functions #638

Closed
autodidaddict opened this issue Aug 7, 2019 · 18 comments
Closed

Ctx needs ability to invoke functions #638

autodidaddict opened this issue Aug 7, 2019 · 18 comments
Labels
🎉 enhancement New feature!

Comments

@autodidaddict
Copy link

Motivation

My project has a wasmer runtime embedded in it, and my use case involves the host runtime invoking a guest module function. In response, the guest module can invoke multiple host functions. Within those host functions, I might need to make further guest function calls to interrogate data.

tl;dr host functions that are free-floating (not bound to a struct instance) have no way to invoke functions in the module, though they can read/write the module's memory block.

Proposed solution

Add the func call to the Ctx struct, or add a reference to the instance that invoked the host function to the signature of the function pointed at by the func! macro.

Alternatives

The only alternatives we have are trying to store global references to the Instance, but since the Instance triggers all kinds of thread safety errors as soon as we try and do that, it's not a viable option. Right now it appears that host functions cannot call guest functions, which severely inhibits some RPC-style call patterns.

Additional context

I'm looking at this functionality as a way to simplify the wascap interface standard to be more fluid and to avoid explicit cross-wasm boundary calls to allocate memory (decoupling allocation from the RPC interface). I can't do this if the host functions can't make guest function calls.

@MarkMcCaskey
Copy link
Contributor

I've created an example that does this, but unless I'm misunderstanding the spec, then there's an issue with the Wasm Rustc is generating rust-lang/rust#64521 .

The only change needed is for runtime-core::instance::call_func_with_index to be public.

I've confirmed that Wasm and host fn callbacks now work with all 3 backends.

The issue is that given what Rustc currently does, it's not possible to tell the difference between a host fn and a wasm fn. And not only that, but if the Rust code tries to do anything with a host-fn as a callback, it clobbers all the other indices making them invalid.

The way I'm doing it here is closer to #670, passing the functions directly. I agree that we should have other ways to do this like this issue mentions!

bors bot added a commit that referenced this issue Sep 19, 2019
803: Add method to call function at index on Ctx r=MarkMcCaskey a=MarkMcCaskey

For #638 and #670

```Rust
fn call_guest_fn(ctx: &mut Ctx, guest_fn: u32) -> u32 {
    println!("{}", guest_fn);

    let guest_fn_typed = unsafe { guest_fn };

    let result = ctx.call_with_table_index(guest_fn_typed, &[]).unwrap();
    println!("  -> {:?}", result);

    0
}
```
is what this looks like from the Host side


# Review

- [x] Create a short description of the the change in the CHANGELOG.md file


Co-authored-by: Mark McCaskey <[email protected]>
Co-authored-by: Mark McCaskey <[email protected]>
bors bot added a commit that referenced this issue Sep 19, 2019
803: Add method to call function at index on Ctx r=MarkMcCaskey a=MarkMcCaskey

For #638 and #670

```Rust
fn call_guest_fn(ctx: &mut Ctx, guest_fn: u32) -> u32 {
    println!("{}", guest_fn);

    let guest_fn_typed = unsafe { std::mem::transmute(guest_fn) };

    let result = ctx.call_with_table_index(guest_fn_typed, &[]).unwrap();
    println!("  -> {:?}", result);

    0
}
```
is what this looks like from the Host side

See `examples/callback.rs` for an example that doesn't require `transmute`


# Review

- [x] Create a short description of the the change in the CHANGELOG.md file


Co-authored-by: Mark McCaskey <[email protected]>
Co-authored-by: Mark McCaskey <[email protected]>
@MarkMcCaskey
Copy link
Contributor

Okay! Update on this front:

We have a PR out for making Instance Send (#807) which may resolve some of your issues, but it sounds like those might be a work around anyways. #803 merged which allows calling functions via the table. In order to get the table mapping, you'd have to get data from the Wasm side. Is this something that seems feasible for your use case? I think the biggest missing piece now that I think about it is that the change I made requires you to pass in arguments and doesn't let the caller query the signature of the function. If that's an issue, I can prioritize looking into that, too.

@thedavidmeister
Copy link

thedavidmeister commented Jan 30, 2020

this seems really fundamental to me

i'm trying to get strings out of the host into the guest from an imported host function call using wee_alloc

so i have a setup looking something like this:

pub fn import_object() -> ImportObject {
    imports! {
        "env" => {
            "__host_process_string" => func!(host_process_string),
        },
    }
}
fn host_process_string(ctx: &mut Ctx, ptr: i32, cap: i32) -> u64 {
    println!("hiii {} {}", ptr, cap);
    let guest_string = read_guest_string(ctx, ptr, cap);
    println!("guest_string {}", guest_string);
    let processed_string = format!("host: {}", guest_string);
    u64_merge_bits(processed_string.as_ptr() as _, processed_string.len() as _)
}

but i get memory out of bounds issues here, because the processed_string ptr and len point to the host ptr and len

so i need to write from the host to the guest and then tell the guest to go pick up a string from the guest's pointer on the return side

to work with wee_alloc i have this function to allow hosts to write to the guest:

pub fn write_guest_string(instance: &mut Instance, s: String) -> (i32, i32) {
    let guest_ptr = match instance
        .call("pre_alloc_string", &[Value::I32(s.len() as _)])
        .expect("run pre alloc")[0]
    {
        Value::I32(i) => i,
        _ => unreachable!(),
    };

    let memory = instance.context_mut().memory(0);

    for (byte, cell) in s
        .bytes()
        .zip(memory.view()[guest_ptr as _..(guest_ptr + s.len() as i32) as _].iter()) {
            cell.set(byte)
    }

    (guest_ptr, s.len() as i32)
}

with the "pre alloc" function inside the guest looking like this:

pub extern "C" fn pre_alloc_string(cap: u32) -> i32 {
    // https://doc.rust-lang.org/std/string/struct.String.html#examples-8
    // Prevent automatically dropping the String's data
    let dummy:Vec<u8> = Vec::with_capacity(cap as _);
    let ptr = dummy.as_slice().as_ptr() as i32;
    mem::ManuallyDrop::new(dummy);
    ptr
}

so a ptr gets calculated by wee_alloc inside the guest, and then wee_alloc forgets about it so it doesn't get cleaned up, so then the host can write to that, then on the guest side again we can pick it back up with:

pub fn host_string(ptr: u32, len: u32) -> String {
    let slice = unsafe { slice::from_raw_parts(ptr as _, len as _) };
    String::from(std::str::from_utf8(slice).unwrap())
}

my problem is that i need the instance to call the pre allocation function but inside the host_process_string imported function i only have access to a context

this means that AFAICS i cannot safely have the host return strings to the guest from imported functions

i feel like i must be missing something obvious here 😅

can someone ELI5 what the situation is for calling guest functions from inside imported host functions?

@autodidaddict
Copy link
Author

autodidaddict commented Jan 30, 2020 via email

@thedavidmeister
Copy link

i see https://github.com/confio/cosmwasm/commit/92d25c8f8e457a4895bc0a578afdbb579d49e567 as a solution in a referencing issue

it relies on the guest preallocating some memory that the host writes into, then the host returns the actual amount of memory allocated, which seems like a half solution that could work for some situations and not others:

  • guest has to allocate "enough" memory, which in many situations can't be known ahead of time so we'd have to shoot for "way more than we expect to need", which in wasm can mean new pages added that are never used and can never be removed
  • host could potentially need more than the guest allows, which would panic, the solution linked has a TODO for this situation

@autodidaddict
Copy link
Author

autodidaddict commented Jan 30, 2020 via email

@thedavidmeister
Copy link

@autodidaddict looks awesome! just confirming that wapc is a complete replacement/alternative for wasmer, not something that extends it?

if so, is there something in wapc that works like the closures that wasmer provides for its imports! and func! macros (capturing e.g. references to things at the moment of instantiating the wasm instance)?

i'm trying to work with atomic writes on the host so that subsequent/nested wasm calls that have the host write on behalf of a single guest making calls all succeed or fail together (the references captured by the closures are all to something handling the atomicity for me)

i'm hoping to both be able to capture references to something that co-ordinates writes (seems easy with wasmer) and be able to return data from the host from imported functions (seems easy with wapc)

@autodidaddict
Copy link
Author

autodidaddict commented Jan 30, 2020 via email

@thedavidmeister
Copy link

thedavidmeister commented Jan 30, 2020

ok, based on your blog post i think i know what to do https://medium.com/@KevinHoffman/introducing-wapc-dc9d8b0c2223

it is late here but tomorrow i think i'll try not having the imported host function call the instance again, instead it can return the pointer/length to the host version of the data as a u64 with high/low bits as u32 pointer/length

the guest could then use this to prepare its own pointer and length, and pass this back to the host alongside the hosts pointer and length, then the host can copy its data into the guests data

i think that's how wapc roughly works :) sounds good to try out

@thedavidmeister
Copy link

i managed to get something working with a two step process

u64 pointer -> [u64; 2] pointer+length -> data

because the intermediate pointer+length is always 2 u64 ints i know it is 16 bytes so can copy things between host/guest without ambiguity in a fairly standard way

@dsherret
Copy link
Contributor

dsherret commented Jul 4, 2020

It seems this is somewhat possible with the ctx.call_table_with_index function.

I found this example, but it's not so convenient as you can't just call a function by name: https://github.com/wasmerio/wasmer/pull/803/files#diff-e9732635192012c4363a1b003e040e5e

@mikevoronov
Copy link
Contributor

mikevoronov commented Jul 5, 2020

Also, you could use the following scheme:

  1. Export allocate/deallocate functions from Wasm like here
use std::alloc::alloc as global_alloc;
use std::alloc::dealloc as global_dealloc;
use std::alloc::Layout;

/// Allocates memory area of specified size and returns its address.
/// Returns 0 if supplied size is too long.
#[no_mangle]
pub unsafe fn allocate(size: usize) -> usize {
    let layout = match Layout::from_size_align(size, std::mem::align_of::<u8>()) {
        Ok(layout) => layout,
        // in this case a err may occur only in a case of too long allocated size,
        // so just return 0
        Err(_) => return 0,
    };

    global_alloc(layout) as _
}

/// Deallocates memory area for provided memory pointer and size.
/// Does nothing if supplied size is too long.
#[no_mangle]
pub unsafe fn deallocate(ptr: *mut u8, size: usize) {
    let layout = match Layout::from_size_align(size, std::mem::align_of::<u8>()) {
        Ok(layout) => layout,
        // in this case a err may occur only in a case of too long allocated size,
        // so just done nothing
        Err(_) => return,
    };

    global_dealloc(ptr, layout);
}
  1. In import closure it is needed to find these functions by name (I am using this approch for that).

  2. Then you could call allocate to allocate enough memory for a supplied to a guest string.

  3. After that, it is needed to write this string into guest memory by any way (I am using this one).

  4. At the final step, the thing may be different. You need to supply pointer and size to guest, that could be done in different ways. At now, I use the approach with 2 global variables that contains pointer and size of a string and then call String::from_raw_parts, my Wasm modules export two function to work with them. Overall, such scheme is realized here.

@thedavidmeister
Copy link

@michaelvoronov i'm doing something similar now, but using Vec<u8> rather than strings because i'm dealing with data serialized in messagepack that needs binary data that doesn't work with strings - https://github.com/holochain/holochain-wasmer

@dsherret thanks for the link to the example of nested calls :)

@mikevoronov
Copy link
Contributor

mikevoronov commented Jul 20, 2020

Yes, String is just a form of wrapper for Vec<u8> in Rust :), so this approach could be also applied to it. Almost all would be similar, but instead of String::from_raw_parts Vec::from_raw_parts should be used.

Thank you for the link to your project, will take a look.

@mikevoronov
Copy link
Contributor

Also, it seems that in the refactored Wasmer the 2nd step would be simplified by providing a special API, and 5th step could be rewritten with multi-value.

@thedavidmeister
Copy link

thedavidmeister commented Jul 20, 2020

@michaelvoronov also instead of using globals to track things i'm always having the guest functions just return/receive pointers to memory in the guest, and i prefix the Vec<u8> with a u32 as 4 bytes to encode its length - similar to, but a simplified version of how messagepack works

i haven't seen the new wasmer, maybe multi value would allow ptr/len to be represented directly :)

@mikevoronov
Copy link
Contributor

mikevoronov commented Jul 20, 2020

Heh, I've used this approach in our previous sdk, but now with support of interface-types I decided to use these globals, because from interface-types side I can't decode/encode Vec and extract size. From IT side it looks like this.

So, I am super waiting for the new Wasmer with multi-value support to refactor this messy approach with global variables.

Regarding access to new Wasmer - it seems you can join to the Wasmer slack and contact Syrus (based on this) for granting access.

@syrusakbary
Copy link
Member

We completely changed the way Context works in upcoming Wasmer 1.0.

@MarkMcCaskey is also working on the PR #1739 which will make the implementation trivial.

Closing the issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🎉 enhancement New feature!
Projects
None yet
Development

No branches or pull requests

6 participants