Is it possible to avoid syscalls during Function::call? #1922

ul · 2020-12-12T05:03:25Z

Summary

This question is in the context of using WASM for realtime audio synthesis. I invoke Function::call within an audio thread 48000 times per second. In the profiler, sigprocmask and sigaltstack take a noticeable share of CPU time (see attached screenshot). The overhead itself is more or less acceptable, as it stays constant as the executed WASM code grows, but the unpredictable latency incurred by syscalls is bothersome for my use case: I get high variation in the trace, with some calls taking longer than my per-frame budget. So my question is: is it possible to implement Function::call without the need for syscalls?
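To put a number on the per-frame budget: at 48000 calls per second, each call has roughly 20.8 µs before the next one is due. The sketch below shows one way the budget and the latency variation could be measured. It assumes one call per sample, and `render_one_sample` is a hypothetical stand-in for the real Function::call invocation, not a Wasmer API.

```rust
use std::time::{Duration, Instant};

/// Time available per call at a given call rate (calls per second).
fn per_call_budget(calls_per_second: u32) -> Duration {
    Duration::from_secs(1) / calls_per_second
}

/// Hypothetical stand-in for the real `Function::call` invocation.
fn render_one_sample(phase: f32) -> f32 {
    (phase * std::f32::consts::TAU).sin()
}

/// Runs `calls` invocations and returns the worst observed latency
/// together with the number of calls that exceeded `budget`.
fn measure(calls: u32, budget: Duration) -> (Duration, u32) {
    let mut worst = Duration::ZERO;
    let mut over_budget = 0;
    for i in 0..calls {
        let start = Instant::now();
        // black_box keeps the optimizer from removing the call.
        std::hint::black_box(render_one_sample(i as f32 / calls as f32));
        let elapsed = start.elapsed();
        worst = worst.max(elapsed);
        if elapsed > budget {
            over_budget += 1;
        }
    }
    (worst, over_budget)
}

fn main() {
    let budget = per_call_budget(48_000);
    let (worst, over) = measure(48_000, budget);
    println!("budget per call: {:?}", budget);
    println!("worst call: {:?}, calls over budget: {}", worst, over);
}
```

For a realtime thread the worst-case figure matters more than the average, which is why the sketch tracks the maximum and the count of overruns rather than a mean.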

Additional details


syrusakbary · 2020-12-12T06:39:41Z

As functions can fail, we need to wrap each call so that if it fails we catch the trap properly.
However, this "catch" mechanism is set up on every function call, and it can indeed be optimized.

A few months ago we were thinking about creating an unsafe function, call_unchecked, so users can catch errors themselves (rather than having Wasmer do it for them).
In your use case, that means you would choose how to catch the error, and you could save the 918ms spent in the trap handler code.

How would this be implemented?

We would need a function similar to:

use std::mem;

// Similar to wasmer_call_trampoline, but without catch_traps.
// Traps are not caught here: the caller must wrap the call in catch_traps.
pub unsafe fn wasmer_call_trampoline_unchecked(
    vmctx: VMFunctionEnvironment,
    trampoline: VMTrampoline,
    callee: *const VMFunctionBody,
    values_vec: *mut u8,
) -> Result<(), Trap> {
    mem::transmute::<_, extern "C" fn(VMFunctionEnvironment, *const VMFunctionBody, *mut u8)>(
        trampoline,
    )(vmctx, callee, values_vec);
    // The trampoline returns nothing, so report success once it returns.
    Ok(())
}

And then the user would need to do something similar to:

// Install the trap handler once around the whole batch of calls.
catch_traps(vmctx, || {
    for _ in 0..48000 {
        unsafe {
            func.call_unchecked(...);
        }
    }
})

This way, catch_traps is called only once rather than 48,000 times, and the code is still safe to execute (any error is returned when it happens).
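The difference between guarding every call and guarding the whole batch can be sketched without Wasmer at all. The example below is only an analogy: it uses the standard library's `std::panic::catch_unwind` in place of catch_traps, and a panic in place of a WASM trap. `process_sample`, `guard_per_call` and `guard_per_batch` are hypothetical names introduced for illustration.

```rust
use std::panic;

/// Hypothetical per-sample work; panics at `fail_at` to simulate a trap.
fn process_sample(i: u32, fail_at: Option<u32>) -> u32 {
    if Some(i) == fail_at {
        panic!("simulated trap at sample {}", i);
    }
    i % 7
}

/// Guard installed around every single call (the per-call pattern).
fn guard_per_call(n: u32, fail_at: Option<u32>) -> Result<u32, String> {
    let mut acc = 0;
    for i in 0..n {
        let v = panic::catch_unwind(|| process_sample(i, fail_at))
            .map_err(|_| format!("trap in call {}", i))?;
        acc += v;
    }
    Ok(acc)
}

/// Guard installed once around the whole batch (the proposed pattern).
fn guard_per_batch(n: u32, fail_at: Option<u32>) -> Result<u32, String> {
    panic::catch_unwind(|| {
        let mut acc = 0;
        for i in 0..n {
            acc += process_sample(i, fail_at);
        }
        acc
    })
    .map_err(|_| "trap somewhere in the batch".to_string())
}

fn main() {
    // Both variants compute the same result when nothing fails.
    println!("{:?}", guard_per_call(48_000, None));
    println!("{:?}", guard_per_batch(48_000, None));
    // Both still report a failure, but only the per-call guard knows which call failed.
    println!("{:?}", guard_per_call(100, Some(5)));
    println!("{:?}", guard_per_batch(100, Some(5)));
}
```

The trade-off is visible in the error values: a single guard pays the setup cost once, but a failure aborts the rest of the batch and no longer identifies the failing call.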

ul · 2020-12-12T06:45:57Z

Thank you for the explanation @syrusakbary! Indeed, it would be nice to have such a flexible API for traps, to allow for optimisations in cases like mine.

syrusakbary · 2021-02-09T01:37:41Z

Hi @ul,

I just created #2102. It should address your main question as it makes Function calls an order of magnitude faster.

2102: Use platform setjmp/longjmp to optimize function calls r=syrusakbary a=syrusakbary

# Description

Use platform setjmp/longjmp when possible to optimize function calls. This PR fixes #1922.

This improves timings from:

```
Benchmarking basic static func llvm: Collecting 100 samples in estimated 5.0004
basic static func llvm time: [131.99 ns 132.31 ns 132.66 ns]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
Benchmarking basic static func with many args llvm: Collecting 100 samples in es
basic static func with many args llvm time: [140.14 ns 140.55 ns 140.94 ns]
Benchmarking basic static func cranelift: Collecting 100 samples in estimated 5.
basic static func cranelift time: [133.51 ns 133.81 ns 134.09 ns]
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe
Benchmarking basic static func with many args cranelift: Collecting 100 samples
basic static func with many args cranelift time: [144.17 ns 145.04 ns 146.01 ns]
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe
Benchmarking basic dynfunc llvm: Collecting 100 samples in estimated 5.0012 s (2
basic dynfunc llvm time: [228.77 ns 229.59 ns 230.35 ns]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
Benchmarking basic dynfunc with many args llvm: Collecting 100 samples in estima
basic dynfunc with many args llvm time: [277.33 ns 279.09 ns 280.91 ns]
Benchmarking basic dynfunc cranelift: Collecting 100 samples in estimated 5.0008
basic dynfunc cranelift time: [229.38 ns 230.38 ns 231.43 ns]
Found 15 outliers among 100 measurements (15.00%)
  14 (14.00%) high mild
  1 (1.00%) high severe
Benchmarking basic dynfunc with many args cranelift: Collecting 100 samples in e
basic dynfunc with many args cranelift time: [278.24 ns 280.11 ns 281.96 ns]
```

To:

```
Benchmarking basic static func llvm: Collecting 100 samples in estimated 5.0001
basic static func llvm time: [19.791 ns 19.817 ns 19.845 ns]
change: [-85.086% -85.045% -85.006%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe
Benchmarking basic static func with many args llvm: Collecting 100 samples in es
basic static func with many args llvm time: [29.684 ns 29.716 ns 29.756 ns]
change: [-78.858% -78.802% -78.743%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  1 (1.00%) low severe
  2 (2.00%) high mild
  12 (12.00%) high severe
Benchmarking basic static func cranelift: Collecting 100 samples in estimated 5.
basic static func cranelift time: [22.266 ns 22.289 ns 22.316 ns]
change: [-83.476% -83.279% -82.980%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  8 (8.00%) high mild
  7 (7.00%) high severe
Benchmarking basic static func with many args cranelift: Collecting 100 samples
basic static func with many args cranelift time: [30.699 ns 30.726 ns 30.757 ns]
change: [-78.786% -78.682% -78.586%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe
Benchmarking basic dynfunc llvm: Collecting 100 samples in estimated 5.0005 s (4
basic dynfunc llvm time: [120.06 ns 121.13 ns 122.21 ns]
change: [-47.266% -46.814% -46.367%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking basic dynfunc with many args llvm: Collecting 100 samples in estima
basic dynfunc with many args llvm time: [172.60 ns 176.38 ns 181.35 ns]
change: [-32.788% -27.622% -21.063%] (p = 0.00 < 0.05)
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
Benchmarking basic dynfunc cranelift: Collecting 100 samples in estimated 5.0004
basic dynfunc cranelift time: [120.39 ns 121.71 ns 123.13 ns]
change: [-46.541% -45.905% -45.238%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
Benchmarking basic dynfunc with many args cranelift: Collecting 100 samples in e
basic dynfunc with many args cranelift time: [162.72 ns 163.36 ns 164.01 ns]
change: [-41.999% -41.705% -41.419%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
```

So, best case scenario from 131.99ns to 19.817ns.

# Review

- [ ] Add a short description of the change to the CHANGELOG.md file

Co-authored-by: Syrus Akbary
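To relate these benchmark figures to the original use case, the arithmetic can be worked through as below. This is a rough sketch only: it takes the median "basic static func llvm" times (132.31 ns before, 19.817 ns after) and assumes the 48000 calls per second from the question map onto that benchmark, which the thread does not confirm. The function names are introduced here for illustration.

```rust
/// Median per-call times in nanoseconds, copied from the Criterion output
/// for "basic static func llvm".
const BEFORE_NS: f64 = 132.31;
const AFTER_NS: f64 = 19.817;
/// Call rate from the original question (assumed to apply to this benchmark).
const CALLS_PER_SECOND: f64 = 48_000.0;

/// How many times faster the new call path is.
fn speedup(before_ns: f64, after_ns: f64) -> f64 {
    before_ns / after_ns
}

/// Total call overhead, in milliseconds, accumulated per second of audio.
fn overhead_ms_per_second(per_call_ns: f64, calls_per_second: f64) -> f64 {
    per_call_ns * calls_per_second / 1e6
}

fn main() {
    println!("speedup: {:.2}x", speedup(BEFORE_NS, AFTER_NS));
    println!(
        "overhead before: {:.3} ms per second",
        overhead_ms_per_second(BEFORE_NS, CALLS_PER_SECOND)
    );
    println!(
        "overhead after: {:.3} ms per second",
        overhead_ms_per_second(AFTER_NS, CALLS_PER_SECOND)
    );
}
```

Under these assumptions the call overhead drops from about 6.35 ms to about 0.95 ms per second of audio, a speedup of roughly 6.7x. The benchmarks report averages and say nothing about worst-case latency, which was the original concern.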

ul · 2021-02-09T09:58:19Z

Thank you, I appreciate it!

syrusakbary · 2021-10-20T13:18:50Z

Duplicate of #2562

ul added the ❓ question I've a question! label Dec 12, 2020

ul mentioned this issue Dec 15, 2020

Provide an API for unchecked wasm calls and setting trap handlers manually #1935

Closed

1 task

syrusakbary mentioned this issue Feb 9, 2021

feat(vm) Use sigsetjmp and siglongjmp when available #2102

Closed

1 task

syrusakbary mentioned this issue May 15, 2021

Added call unchecked to the functions #2319

Closed

Hywan added the 📦 lib-vm About wasmer-vm label Jul 16, 2021

syrusakbary marked this as a duplicate of #2562 Oct 20, 2021

syrusakbary closed this as completed Oct 20, 2021
