From 17f5fffa534c78ece186cb878ba0811087c5285e Mon Sep 17 00:00:00 2001
From: Nick Fitzgerald
Date: Sat, 9 Sep 2023 10:26:45 -0700
Subject: [PATCH 01/14] Add component call micro-benchmarks (#6981)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This commit adds the component equivalents of the existing core Wasm call
micro-benchmarks.

This also adds a sprinkling of `#[inline]` to some functions that I noticed
when glancing at some profiles.

The two most important numbers:

```
sync/no-hook/component - host-to-wasm - typed - nop
                        time:   [75.849 ns 76.476 ns 77.247 ns]
sync/no-hook/component - wasm-to-host - typed - nop
                        time:   [33.614 ns 33.872 ns 34.170 ns]
```

The full benchmark results are in here:
``` $ cargo bench --features component-model --bench call 'component' Finished bench [optimized] target(s) in 0.19s Running benches/call.rs (target/release/deps/call-4d8d1585dd2825a2) sync/no-hook/component - host-to-wasm - typed - nop time: [75.849 ns 76.476 ns 77.247 ns] Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) high mild 4 (4.00%) high severe sync/no-hook/component - host-to-wasm - untyped - nop time: [108.29 ns 109.66 ns 111.51 ns] Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) high mild 5 (5.00%) high severe sync/no-hook/component - host-to-wasm - typed - nop-params-and-results time: [79.968 ns 80.756 ns 81.728 ns] Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high severe sync/no-hook/component - host-to-wasm - untyped - nop-params-and-results time: [210.27 ns 211.72 ns 213.34 ns] Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) high mild 3 (3.00%) high severe sync/hook-sync/component - host-to-wasm - typed - nop time: [76.840 ns 77.295 ns 77.770 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe sync/hook-sync/component - host-to-wasm - untyped - nop time: [109.63 ns 110.42 ns 111.26 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe sync/hook-sync/component - host-to-wasm - typed - nop-params-and-results time: [81.324 ns 82.344 ns 83.663 ns] Found 5 outliers among 100 measurements (5.00%) 2 (2.00%) high mild 3 (3.00%) high severe sync/hook-sync/component - host-to-wasm - untyped - nop-params-and-results time: [211.84 ns 215.06 ns 219.22 ns] Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe async/no-hook/component - host-to-wasm - typed - nop time: [23.759 µs 23.969 µs 24.221 µs] Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) high mild 10 (10.00%) high severe async/no-hook/component - host-to-wasm - untyped - nop time: [23.941 µs 24.093 µs 24.254 µs] Found 3 outliers among 100 measurements 
(3.00%) 2 (2.00%) high mild 1 (1.00%) high severe async/no-hook/component - host-to-wasm - typed - nop-params-and-results time: [24.286 µs 24.459 µs 24.629 µs] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe async/no-hook/component - host-to-wasm - untyped - nop-params-and-results time: [24.258 µs 24.390 µs 24.528 µs] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe async/hook-sync/component - host-to-wasm - typed - nop time: [24.055 µs 24.224 µs 24.408 µs] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe async/hook-sync/component - host-to-wasm - untyped - nop time: [24.217 µs 24.364 µs 24.517 µs] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high severe async/hook-sync/component - host-to-wasm - typed - nop-params-and-results time: [24.207 µs 24.331 µs 24.463 µs] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe async/hook-sync/component - host-to-wasm - untyped - nop-params-and-results time: [24.607 µs 24.767 µs 24.936 µs] Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) high mild 2 (2.00%) high severe async-pool/no-hook/component - host-to-wasm - typed - nop time: [456.89 ns 459.65 ns 462.68 ns] Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe async-pool/no-hook/component - host-to-wasm - untyped - nop time: [490.07 ns 492.87 ns 495.88 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe async-pool/no-hook/component - host-to-wasm - typed - nop-params-and-results time: [471.68 ns 475.01 ns 478.59 ns] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe async-pool/no-hook/component - host-to-wasm - untyped - nop-params-and-results time: [597.02 ns 600.61 ns 604.53 ns] Found 5 outliers among 100 measurements (5.00%) 2 (2.00%) high mild 3 (3.00%) high severe 
async-pool/hook-sync/component - host-to-wasm - typed - nop time: [458.06 ns 460.82 ns 463.77 ns] Found 6 outliers among 100 measurements (6.00%) 6 (6.00%) high severe async-pool/hook-sync/component - host-to-wasm - untyped - nop time: [494.20 ns 497.65 ns 501.48 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high severe async-pool/hook-sync/component - host-to-wasm - typed - nop-params-and-results time: [472.40 ns 476.08 ns 480.10 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high severe async-pool/hook-sync/component - host-to-wasm - untyped - nop-params-and-results time: [598.55 ns 603.79 ns 610.18 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe sync/no-hook/component - wasm-to-host - typed - nop time: [33.614 ns 33.872 ns 34.170 ns] Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) high mild 8 (8.00%) high severe sync/no-hook/component - wasm-to-host - typed - nop-params-and-results time: [37.416 ns 37.700 ns 38.002 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe sync/no-hook/component - wasm-to-host - untyped - nop time: [58.126 ns 58.478 ns 58.846 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe sync/no-hook/component - wasm-to-host - untyped - nop-params-and-results time: [170.14 ns 171.33 ns 172.68 ns] Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe sync/hook-sync/component - wasm-to-host - typed - nop time: [33.336 ns 33.556 ns 33.796 ns] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe sync/hook-sync/component - wasm-to-host - typed - nop-params-and-results time: [37.399 ns 37.654 ns 37.904 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe sync/hook-sync/component - wasm-to-host - untyped - nop time: [58.438 ns 58.924 ns 59.485 ns] Found 4 outliers among 100 
measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe sync/hook-sync/component - wasm-to-host - untyped - nop-params-and-results time: [169.36 ns 170.44 ns 171.60 ns] Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe async/no-hook/component - wasm-to-host - typed - nop time: [33.882 ns 34.198 ns 34.573 ns] Found 11 outliers among 100 measurements (11.00%) 6 (6.00%) high mild 5 (5.00%) high severe async/no-hook/component - wasm-to-host - typed - nop-params-and-results time: [37.407 ns 37.820 ns 38.371 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe async/no-hook/component - wasm-to-host - untyped - nop time: [58.400 ns 58.937 ns 59.537 ns] Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) high mild 2 (2.00%) high severe async/no-hook/component - wasm-to-host - untyped - nop-params-and-results time: [170.15 ns 171.72 ns 173.52 ns] Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) high mild 3 (3.00%) high severe async/no-hook/component - wasm-to-host - async-typed - nop time: [48.383 ns 48.801 ns 49.317 ns] Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) high mild 1 (1.00%) high severe async/no-hook/component - wasm-to-host - async-typed - nop-params-and-results time: [59.723 ns 60.158 ns 60.657 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe async/hook-sync/component - wasm-to-host - typed - nop time: [33.537 ns 34.056 ns 34.742 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe async/hook-sync/component - wasm-to-host - typed - nop-params-and-results time: [37.390 ns 37.888 ns 38.562 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe async/hook-sync/component - wasm-to-host - untyped - nop time: [58.506 ns 58.906 ns 59.361 ns] Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) high mild 2 (2.00%) high severe 
async/hook-sync/component - wasm-to-host - untyped - nop-params-and-results time: [170.70 ns 172.62 ns 174.80 ns] Found 5 outliers among 100 measurements (5.00%) 2 (2.00%) high mild 3 (3.00%) high severe async/hook-sync/component - wasm-to-host - async-typed - nop time: [48.308 ns 48.764 ns 49.267 ns] Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) high mild 3 (3.00%) high severe async/hook-sync/component - wasm-to-host - async-typed - nop-params-and-results time: [57.503 ns 57.887 ns 58.307 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high severe async-pool/no-hook/component - wasm-to-host - typed - nop time: [33.473 ns 33.792 ns 34.142 ns] Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe async-pool/no-hook/component - wasm-to-host - typed - nop-params-and-results time: [37.523 ns 38.040 ns 38.638 ns] Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe async-pool/no-hook/component - wasm-to-host - untyped - nop time: [57.989 ns 58.350 ns 58.737 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high severe async-pool/no-hook/component - wasm-to-host - untyped - nop-params-and-results time: [169.55 ns 170.93 ns 172.48 ns] Found 7 outliers among 100 measurements (7.00%) 4 (4.00%) high mild 3 (3.00%) high severe async-pool/no-hook/component - wasm-to-host - async-typed - nop time: [48.323 ns 48.700 ns 49.144 ns] Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) high mild 4 (4.00%) high severe async-pool/no-hook/component - wasm-to-host - async-typed - nop-params-and-results time: [57.521 ns 58.090 ns 58.739 ns] Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe async-pool/hook-sync/component - wasm-to-host - typed - nop time: [33.379 ns 33.602 ns 33.838 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe async-pool/hook-sync/component - wasm-to-host - typed - 
nop-params-and-results time: [37.361 ns 37.608 ns 37.857 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe async-pool/hook-sync/component - wasm-to-host - untyped - nop time: [58.523 ns 58.848 ns 59.180 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high severe async-pool/hook-sync/component - wasm-to-host - untyped - nop-params-and-results time: [170.59 ns 171.57 ns 172.63 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high severe async-pool/hook-sync/component - wasm-to-host - async-typed - nop time: [48.265 ns 48.520 ns 48.794 ns] Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high severe async-pool/hook-sync/component - wasm-to-host - async-typed - nop-params-and-results time: [57.619 ns 57.918 ns 58.234 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high severe ```
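
To put the results above in proportion, here is a minimal sketch that computes how much slower the untyped calls are than the typed ones. The timings are the medians of the `sync/no-hook/component` rows above; the names `ROWS`, `untyped_slowdown`, and `slowdown_report` are illustrative only and are not part of this patch.

```rust
/// Median timings in nanoseconds as (benchmark, typed, untyped), copied from
/// the `sync/no-hook/component` rows of the results above.
const ROWS: [(&str, f64, f64); 4] = [
    ("host-to-wasm - nop", 76.476, 109.66),
    ("host-to-wasm - nop-params-and-results", 80.756, 211.72),
    ("wasm-to-host - nop", 33.872, 58.478),
    ("wasm-to-host - nop-params-and-results", 37.700, 171.33),
];

/// How many times slower the untyped call is than the typed call.
fn untyped_slowdown(typed_ns: f64, untyped_ns: f64) -> f64 {
    untyped_ns / typed_ns
}

/// One formatted line per row, suitable for printing from a `main` function.
fn slowdown_report() -> Vec<String> {
    ROWS.iter()
        .map(|(name, typed, untyped)| {
            format!("{name}: {:.2}x", untyped_slowdown(*typed, *untyped))
        })
        .collect()
}
```

On these numbers the untyped path costs between about 1.4x (host-to-wasm `nop`) and about 4.5x (wasm-to-host `nop-params-and-results`) the typed path, so the gap grows once parameters and results have to be converted through `component::Val`.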
--- benches/call.rs | 472 +++++++++++++++++- crates/environ/src/component/types.rs | 1 + crates/runtime/src/component.rs | 2 + crates/runtime/src/component/resources.rs | 2 + crates/wasmtime/src/component/component.rs | 1 + crates/wasmtime/src/component/func.rs | 2 + crates/wasmtime/src/component/func/options.rs | 5 + crates/wasmtime/src/component/func/typed.rs | 14 + crates/wasmtime/src/component/instance.rs | 4 + crates/wasmtime/src/store.rs | 1 + 10 files changed, 480 insertions(+), 24 deletions(-) diff --git a/benches/call.rs b/benches/call.rs index 8c1b221422bd..90f9beca60b2 100644 --- a/benches/call.rs +++ b/benches/call.rs @@ -13,6 +13,9 @@ criterion_group!(benches, measure_execution_time); fn measure_execution_time(c: &mut Criterion) { host_to_wasm(c); wasm_to_host(c); + + #[cfg(feature = "component-model")] + component::measure_execution_time(c); } #[derive(Copy, Clone)] @@ -40,6 +43,10 @@ impl IsAsync { fn engines() -> Vec<(Engine, IsAsync)> { let mut config = Config::new(); + + #[cfg(feature = "component-model")] + config.wasm_component_model(true); + vec![ (Engine::new(&config).unwrap(), IsAsync::No), ( @@ -115,7 +122,7 @@ fn bench_host_to_wasm( { // Benchmark the "typed" version, which should be faster than the versions // below. - c.bench_function(&format!("host-to-wasm - typed - {}", name), |b| { + c.bench_function(&format!("core - host-to-wasm - typed - {}", name), |b| { let typed = instance .get_typed_func::(&mut *store, name) .unwrap(); @@ -131,7 +138,7 @@ fn bench_host_to_wasm( // Benchmark the "untyped" version which should be the slowest of the three // here, but not unduly slow. 
- c.bench_function(&format!("host-to-wasm - untyped - {}", name), |b| { + c.bench_function(&format!("core - host-to-wasm - untyped - {}", name), |b| { let untyped = instance.get_func(&mut *store, name).unwrap(); let params = typed_params.to_vals(); let expected_results = typed_results.to_vals(); @@ -156,26 +163,29 @@ fn bench_host_to_wasm( // Benchmark the "unchecked" version which should be between the above two, // but is unsafe. - c.bench_function(&format!("host-to-wasm - unchecked - {}", name), |b| { - let untyped = instance.get_func(&mut *store, name).unwrap(); - let params = typed_params.to_vals(); - let results = typed_results.to_vals(); - let mut space = vec![ValRaw::i32(0); params.len().max(results.len())]; - b.iter(|| unsafe { - for (i, param) in params.iter().enumerate() { - space[i] = param.to_raw(&mut *store); - } - untyped - .call_unchecked(&mut *store, space.as_mut_ptr(), space.len()) - .unwrap(); - for (i, expected) in results.iter().enumerate() { - assert_vals_eq( - expected, - &Val::from_raw(&mut *store, space[i], expected.ty()), - ); - } - }) - }); + c.bench_function( + &format!("core - host-to-wasm - unchecked - {}", name), + |b| { + let untyped = instance.get_func(&mut *store, name).unwrap(); + let params = typed_params.to_vals(); + let results = typed_results.to_vals(); + let mut space = vec![ValRaw::i32(0); params.len().max(results.len())]; + b.iter(|| unsafe { + for (i, param) in params.iter().enumerate() { + space[i] = param.to_raw(&mut *store); + } + untyped + .call_unchecked(&mut *store, space.as_mut_ptr(), space.len()) + .unwrap(); + for (i, expected) in results.iter().enumerate() { + assert_vals_eq( + expected, + &Val::from_raw(&mut *store, space[i], expected.ty()), + ); + } + }) + }, + ); } /// Benchmarks the overhead of calling the host from WebAssembly itself @@ -368,7 +378,7 @@ fn wasm_to_host(c: &mut Criterion) { desc: &str, is_async: IsAsync, ) { - group.bench_function(&format!("wasm-to-host - {} - nop", desc), |b| { + 
group.bench_function(&format!("core - wasm-to-host - {} - nop", desc), |b| { let run = instance .get_typed_func::(&mut *store, "run-nop") .unwrap(); @@ -383,7 +393,7 @@ fn wasm_to_host(c: &mut Criterion) { }) }); group.bench_function( - &format!("wasm-to-host - {} - nop-params-and-results", desc), + &format!("core - wasm-to-host - {} - nop-params-and-results", desc), |b| { let run = instance .get_typed_func::(&mut *store, "run-nop-params-and-results") .unwrap(); @@ -468,3 +478,417 @@ fn dummy_waker() -> Waker { assert_eq!(ptr as usize, 5); } } + +#[cfg(feature = "component-model")] +mod component { + use super::*; + use wasmtime::component::{self, Component}; + + pub fn measure_execution_time(c: &mut Criterion) { + host_to_wasm(c); + wasm_to_host(c); + } + + trait ToComponentVal { + fn to_component_val(&self) -> component::Val; + } + + impl ToComponentVal for u32 { + fn to_component_val(&self) -> component::Val { + component::Val::U32(*self) + } + } + + impl ToComponentVal for u64 { + fn to_component_val(&self) -> component::Val { + component::Val::U64(*self) + } + } + + impl ToComponentVal for f32 { + fn to_component_val(&self) -> component::Val { + component::Val::Float32(*self) + } + } + + trait ToComponentVals { + fn to_component_vals(&self) -> Vec<component::Val>; + } + + macro_rules! 
tuples { + ($($t:ident)*) => ( + #[allow(non_snake_case)] + impl<$($t:Copy + ToComponentVal,)*> ToComponentVals for ($($t,)*) { + fn to_component_vals(&self) -> Vec<component::Val> { + let mut _dst = Vec::new(); + let ($($t,)*) = *self; + $(_dst.push($t.to_component_val());)* + _dst + } + } + ) + } + + tuples!(); + tuples!(A); + tuples!(A B); + tuples!(A B C); + + fn host_to_wasm(c: &mut Criterion) { + for (engine, is_async) in engines() { + let mut store = Store::new(&engine, ()); + + let component = Component::new( + &engine, + r#" + (component + (core module $m + (func (export "nop")) + (func (export "nop-params-and-results") (param i32 i64) (result f32) + f32.const 0 + ) + ) + (core instance $i (instantiate $m)) + (func (export "nop") + (canon lift (core func $i "nop")) + ) + (func (export "nop-params-and-results") (param "x" u32) (param "y" u64) (result "z" float32) + (canon lift (core func $i "nop-params-and-results")) + ) + ) + "#, + ) + .unwrap(); + + let linker = component::Linker::<()>::new(&engine); + let instance = if is_async.use_async() { + run_await(linker.instantiate_async(&mut store, &component)).unwrap() + } else { + linker.instantiate(&mut store, &component).unwrap() + }; + + let bench_calls = |group: &mut BenchmarkGroup<'_, WallTime>, store: &mut Store<()>| { + // Bench the overhead of a function that has no parameters or results + bench_host_to_wasm::<(), ()>(group, store, &instance, is_async, "nop", (), ()); + // Bench the overhead of a function that has some parameters and just + // one result (will use the raw system-v convention on applicable + // platforms). 
+ bench_host_to_wasm::<(u32, u64), (f32,)>( + group, + store, + &instance, + is_async, + "nop-params-and-results", + (0, 0), + (0.0,), + ); + }; + + // Bench once without any call hooks configured + let name = format!("{}/no-hook", is_async.desc()); + bench_calls(&mut c.benchmark_group(&name), &mut store); + + // Bench again with a "call hook" enabled + store.call_hook(|_, _| Ok(())); + let name = format!("{}/hook-sync", is_async.desc()); + bench_calls(&mut c.benchmark_group(&name), &mut store); + } + } + + fn bench_host_to_wasm<Params, Results>( + c: &mut BenchmarkGroup<'_, WallTime>, + store: &mut Store<()>, + instance: &component::Instance, + is_async: IsAsync, + name: &str, + typed_params: Params, + typed_results: Results, + ) where + Params: + component::ComponentNamedList + ToComponentVals + component::Lower + Copy + Send + Sync, + Results: component::ComponentNamedList + + ToComponentVals + + component::Lift + + Copy + + PartialEq + + Debug + + Send + + Sync, + { + // Benchmark the "typed" version. + c.bench_function( + &format!("component - host-to-wasm - typed - {}", name), + |b| { + let typed = instance + .get_typed_func::<Params, Results>(&mut *store, name) + .unwrap(); + b.iter(|| { + let results = if is_async.use_async() { + run_await(typed.call_async(&mut *store, typed_params)).unwrap() + } else { + typed.call(&mut *store, typed_params).unwrap() + }; + assert_eq!(results, typed_results); + if is_async.use_async() { + run_await(typed.post_return_async(&mut *store)).unwrap() + } else { + typed.post_return(&mut *store).unwrap() + } + }) + }, + ); + + // Benchmark the "untyped" version. 
+ c.bench_function( + &format!("component - host-to-wasm - untyped - {}", name), + |b| { + let untyped = instance.get_func(&mut *store, name).unwrap(); + let params = typed_params.to_component_vals(); + let expected_results = typed_results.to_component_vals(); + let mut results = vec![component::Val::U32(0); expected_results.len()]; + b.iter(|| { + if is_async.use_async() { + run_await(untyped.call_async(&mut *store, &params, &mut results)).unwrap(); + } else { + untyped.call(&mut *store, &params, &mut results).unwrap(); + } + for (expected, actual) in expected_results.iter().zip(&results) { + assert_eq!(expected, actual); + } + if is_async.use_async() { + run_await(untyped.post_return_async(&mut *store)).unwrap() + } else { + untyped.post_return(&mut *store).unwrap() + } + }) + }, + ); + } + + fn wasm_to_host(c: &mut Criterion) { + let module = r#" + (component + (import "nop" (func $comp_nop)) + (import "nop-params-and-results" (func $comp_nop_params_and_results (param "x" u32) (param "y" u64) (result float32))) + + (core func $core_nop (canon lower (func $comp_nop))) + (core func $core_nop_params_and_results (canon lower (func $comp_nop_params_and_results))) + + (core module $m + ;; host imports with a variety of parameters/arguments + (import "" "nop" (func $nop)) + (import "" "nop-params-and-results" + (func $nop_params_and_results (param i32 i64) (result f32)) + ) + + ;; "runner functions" for each of the above imports. Each runner + ;; function takes the number of times to call the host function as + ;; the duration of this entire loop will be measured. 
+ + (func (export "run-nop") (param i64) + loop + call $nop + + local.get 0 ;; decrement & break if necessary + i64.const -1 + i64.add + local.tee 0 + i64.const 0 + i64.ne + br_if 0 + end + ) + + (func (export "run-nop-params-and-results") (param i64) + loop + i32.const 0 ;; always zero parameters + i64.const 0 + call $nop_params_and_results + f32.const 0 ;; assert the correct result + f32.eq + i32.eqz + if + unreachable + end + + local.get 0 ;; decrement & break if necessary + i64.const -1 + i64.add + local.tee 0 + i64.const 0 + i64.ne + br_if 0 + end + ) + ) + + (core instance $i (instantiate $m (with "" (instance + (export "nop" (func $core_nop)) + (export "nop-params-and-results" (func $core_nop_params_and_results)) + )))) + + (func (export "run-nop") (param "i" u64) + (canon lift (core func $i "run-nop")) + ) + (func (export "run-nop-params-and-results") (param "i" u64) + (canon lift (core func $i "run-nop-params-and-results")) + ) + ) + "#; + + for (engine, is_async) in engines() { + let mut store = Store::new(&engine, ()); + let component = component::Component::new(&engine, module).unwrap(); + + bench_calls( + &mut c.benchmark_group(&format!("{}/no-hook", is_async.desc())), + &mut store, + &component, + is_async, + ); + store.call_hook(|_, _| Ok(())); + bench_calls( + &mut c.benchmark_group(&format!("{}/hook-sync", is_async.desc())), + &mut store, + &component, + is_async, + ); + } + + // Given a `Store` will create various instances hooked up to different ways + // of defining host imports to benchmark their overhead. 
+ fn bench_calls( + group: &mut BenchmarkGroup<'_, WallTime>, + store: &mut Store<()>, + component: &component::Component, + is_async: IsAsync, + ) { + let engine = store.engine().clone(); + let mut typed = component::Linker::new(&engine); + typed.root().func_wrap("nop", |_, ()| Ok(())).unwrap(); + typed + .root() + .func_wrap("nop-params-and-results", |_, (x, y): (u32, u64)| { + assert_eq!(x, 0); + assert_eq!(y, 0); + Ok((0.0f32,)) + }) + .unwrap(); + let instance = if is_async.use_async() { + run_await(typed.instantiate_async(&mut *store, &component)).unwrap() + } else { + typed.instantiate(&mut *store, &component).unwrap() + }; + bench_instance(group, store, &instance, "typed", is_async); + + let mut untyped = component::Linker::new(&engine); + untyped + .root() + .func_new(&component, "nop", |_, _, _| Ok(())) + .unwrap(); + untyped + .root() + .func_new( + &component, + "nop-params-and-results", + |_caller, params, results| { + assert_eq!(params.len(), 2); + match params[0] { + component::Val::U32(0) => {} + _ => unreachable!(), + } + match params[1] { + component::Val::U64(0) => {} + _ => unreachable!(), + } + assert_eq!(results.len(), 1); + results[0] = component::Val::Float32(0.0); + Ok(()) + }, + ) + .unwrap(); + let instance = if is_async.use_async() { + run_await(untyped.instantiate_async(&mut *store, &component)).unwrap() + } else { + untyped.instantiate(&mut *store, &component).unwrap() + }; + bench_instance(group, store, &instance, "untyped", is_async); + + // Only define async host imports if allowed + if !is_async.use_async() { + return; + } + + let mut typed = component::Linker::new(&engine); + typed + .root() + .func_wrap_async("nop", |caller, ()| { + Box::new(async { + drop(caller); + Ok(()) + }) + }) + .unwrap(); + typed + .root() + .func_wrap_async("nop-params-and-results", |_caller, (x, y): (u32, u64)| { + Box::new(async move { + assert_eq!(x, 0); + assert_eq!(y, 0); + Ok((0.0f32,)) + }) + }) + .unwrap(); + let instance = 
run_await(typed.instantiate_async(&mut *store, &component)).unwrap(); + bench_instance(group, store, &instance, "async-typed", is_async); + } + + // Given a specific instance executes all of the "runner functions" + fn bench_instance( + group: &mut BenchmarkGroup<'_, WallTime>, + store: &mut Store<()>, + instance: &component::Instance, + desc: &str, + is_async: IsAsync, + ) { + group.bench_function(&format!("component - wasm-to-host - {} - nop", desc), |b| { + let run = instance + .get_typed_func::<(u64,), ()>(&mut *store, "run-nop") + .unwrap(); + b.iter_custom(|iters| { + let start = Instant::now(); + if is_async.use_async() { + run_await(run.call_async(&mut *store, (iters,))).unwrap(); + run_await(run.post_return_async(&mut *store)).unwrap(); + } else { + run.call(&mut *store, (iters,)).unwrap(); + run.post_return(&mut *store).unwrap(); + } + start.elapsed() + }) + }); + group.bench_function( + &format!( + "component - wasm-to-host - {} - nop-params-and-results", + desc + ), + |b| { + let run = instance + .get_typed_func::<(u64,), ()>(&mut *store, "run-nop-params-and-results") + .unwrap(); + b.iter_custom(|iters| { + let start = Instant::now(); + if is_async.use_async() { + run_await(run.call_async(&mut *store, (iters,))).unwrap(); + run_await(run.post_return_async(&mut *store)).unwrap(); + } else { + run.call(&mut *store, (iters,)).unwrap(); + run.post_return(&mut *store).unwrap(); + } + start.elapsed() + }) + }, + ); + } + } +} diff --git a/crates/environ/src/component/types.rs b/crates/environ/src/component/types.rs index 9d481f162b4f..db751bc2ef07 100644 --- a/crates/environ/src/component/types.rs +++ b/crates/environ/src/component/types.rs @@ -319,6 +319,7 @@ macro_rules! 
impl_index { ($(impl Index<$ty:ident> for ComponentTypes { $output:ident => $field:ident })*) => ($( impl std::ops::Index<$ty> for ComponentTypes { type Output = $output; + #[inline] fn index(&self, idx: $ty) -> &$output { &self.$field[idx] } diff --git a/crates/runtime/src/component.rs b/crates/runtime/src/component.rs index 39f9f61eaef3..fc0d4bf2a983 100644 --- a/crates/runtime/src/component.rs +++ b/crates/runtime/src/component.rs @@ -244,6 +244,7 @@ impl ComponentInstance { /// Returns a pointer to the "may leave" flag for this instance specified /// for canonical lowering and lifting operations. + #[inline] pub fn instance_flags(&self, instance: RuntimeComponentInstanceIndex) -> InstanceFlags { unsafe { let ptr = self @@ -560,6 +561,7 @@ impl ComponentInstance { } /// Returns the runtime state of resources associated with this component. + #[inline] pub fn component_resource_tables( &mut self, ) -> &mut PrimaryMap { diff --git a/crates/runtime/src/component/resources.rs b/crates/runtime/src/component/resources.rs index 378fe1e24926..4c22db5569b0 100644 --- a/crates/runtime/src/component/resources.rs +++ b/crates/runtime/src/component/resources.rs @@ -246,6 +246,7 @@ impl ResourceTables<'_> { /// Enters a new calling context, starting a fresh count of borrows and /// such. + #[inline] pub fn enter_call(&mut self) { self.calls.scopes.push(CallContext::default()); } @@ -255,6 +256,7 @@ impl ResourceTables<'_> { /// This requires all information to be available within this /// `ResourceTables` and is only called during lowering/lifting operations /// at this time. 
+ #[inline] pub fn exit_call(&mut self) -> Result<()> { let cx = self.calls.scopes.pop().unwrap(); if cx.borrow_count > 0 { diff --git a/crates/wasmtime/src/component/component.rs b/crates/wasmtime/src/component/component.rs index 47b29d988f6a..013f93bd79e0 100644 --- a/crates/wasmtime/src/component/component.rs +++ b/crates/wasmtime/src/component/component.rs @@ -291,6 +291,7 @@ impl Component { &self.inner.static_modules[idx] } + #[inline] pub(crate) fn types(&self) -> &Arc { self.inner.component_types() } diff --git a/crates/wasmtime/src/component/func.rs b/crates/wasmtime/src/component/func.rs index 9a10b3d0ad91..0b69aff34d8c 100644 --- a/crates/wasmtime/src/component/func.rs +++ b/crates/wasmtime/src/component/func.rs @@ -564,6 +564,7 @@ impl Func { /// /// Panics if this is called on a function in an asynchronous store. /// This only works with functions defined within a synchronous store. + #[inline] pub fn post_return(&self, mut store: impl AsContextMut) -> Result<()> { let store = store.as_context_mut(); assert!( @@ -596,6 +597,7 @@ impl Func { store.on_fiber(|store| self.post_return_impl(store)).await? } + #[inline] fn post_return_impl(&self, mut store: impl AsContextMut) -> Result<()> { let mut store = store.as_context_mut(); let data = &mut store.0[self.0]; diff --git a/crates/wasmtime/src/component/func/options.rs b/crates/wasmtime/src/component/func/options.rs index 08104dcb9d18..6cb038f4f692 100644 --- a/crates/wasmtime/src/component/func/options.rs +++ b/crates/wasmtime/src/component/func/options.rs @@ -348,12 +348,14 @@ impl<'a, T> LowerContext<'a, T> { /// Begins a call into the component instance, starting recording of /// metadata related to resource borrowing. + #[inline] pub fn enter_call(&mut self) { self.resource_tables().enter_call() } /// Completes a call into the component instance, validating that it's ok to /// complete by ensuring the are no remaining active borrows. 
+ #[inline] pub fn exit_call(&mut self) -> Result<()> { self.resource_tables().exit_call() } @@ -389,6 +391,7 @@ impl<'a> LiftContext<'a> { /// /// This is unsafe for the same reasons as `LowerContext::new` where the /// validity of `instance` is required to be upheld by the caller. + #[inline] pub unsafe fn new( store: &'a mut StoreOpaque, options: &'a Options, @@ -499,11 +502,13 @@ impl<'a> LiftContext<'a> { } /// Same as `LowerContext::enter_call` + #[inline] pub fn enter_call(&mut self) { self.resource_tables().enter_call() } /// Same as `LiftContext::enter_call` + #[inline] pub fn exit_call(&mut self) -> Result<()> { self.resource_tables().exit_call() } diff --git a/crates/wasmtime/src/component/func/typed.rs b/crates/wasmtime/src/component/func/typed.rs index 3830b025b2a6..1b43cec321d7 100644 --- a/crates/wasmtime/src/component/func/typed.rs +++ b/crates/wasmtime/src/component/func/typed.rs @@ -653,10 +653,12 @@ forward_lowers! { macro_rules! forward_string_lifts { ($($a:ty,)*) => ($( unsafe impl Lift for $a { + #[inline] fn lift(cx: &mut LiftContext<'_>, ty: InterfaceType, src: &Self::Lower) -> Result { Ok(::lift(cx, ty, src)?.to_str_from_memory(cx.memory())?.into()) } + #[inline] fn load(cx: &mut LiftContext<'_>, ty: InterfaceType, bytes: &[u8]) -> Result { Ok(::load(cx, ty, bytes)?.to_str_from_memory(cx.memory())?.into()) } @@ -712,6 +714,7 @@ macro_rules! integers { } unsafe impl Lower for $primitive { + #[inline] fn lower( &self, _cx: &mut LowerContext<'_, T>, @@ -723,6 +726,7 @@ macro_rules! integers { Ok(()) } + #[inline] fn store( &self, cx: &mut LowerContext<'_, T>, @@ -839,6 +843,7 @@ macro_rules! floats { } unsafe impl Lower for $float { + #[inline] fn lower( &self, _cx: &mut LowerContext<'_, T>, @@ -850,6 +855,7 @@ macro_rules! 
floats { Ok(()) } + #[inline] fn store( &self, cx: &mut LowerContext<'_, T>, @@ -958,6 +964,7 @@ unsafe impl ComponentType for char { } unsafe impl Lower for char { + #[inline] fn lower( &self, _cx: &mut LowerContext<'_, T>, @@ -969,6 +976,7 @@ unsafe impl Lower for char { Ok(()) } + #[inline] fn store( &self, cx: &mut LowerContext<'_, T>, @@ -1291,6 +1299,7 @@ unsafe impl ComponentType for WasmStr { } unsafe impl Lift for WasmStr { + #[inline] fn lift(cx: &mut LiftContext<'_>, ty: InterfaceType, src: &Self::Lower) -> Result { debug_assert!(matches!(ty, InterfaceType::String)); // FIXME: needs memory64 treatment @@ -1300,6 +1309,7 @@ unsafe impl Lift for WasmStr { WasmStr::new(ptr, len, cx) } + #[inline] fn load(cx: &mut LiftContext<'_>, ty: InterfaceType, bytes: &[u8]) -> Result { debug_assert!(matches!(ty, InterfaceType::String)); debug_assert!((bytes.as_ptr() as usize) % (Self::ALIGN32 as usize) == 0); @@ -2137,6 +2147,7 @@ where T: Lift, E: Lift, { + #[inline] fn lift(cx: &mut LiftContext<'_>, ty: InterfaceType, src: &Self::Lower) -> Result { let (ok, err) = match ty { InterfaceType::Result(ty) => { @@ -2171,6 +2182,7 @@ where }) } + #[inline] fn load(cx: &mut LiftContext<'_>, ty: InterfaceType, bytes: &[u8]) -> Result { debug_assert!((bytes.as_ptr() as usize) % (Self::ALIGN32 as usize) == 0); let discrim = bytes[0]; @@ -2305,6 +2317,7 @@ macro_rules! impl_component_ty_for_tuples { unsafe impl<$($t,)*> Lift for ($($t,)*) where $($t: Lift),* { + #[inline] fn lift(cx: &mut LiftContext<'_>, ty: InterfaceType, _src: &Self::Lower) -> Result { let types = match ty { InterfaceType::Tuple(t) => &cx.types[t].types, @@ -2320,6 +2333,7 @@ macro_rules! 
impl_component_ty_for_tuples { )*)) } + #[inline] fn load(cx: &mut LiftContext<'_>, ty: InterfaceType, bytes: &[u8]) -> Result { debug_assert!((bytes.as_ptr() as usize) % (Self::ALIGN32 as usize) == 0); let types = match ty { diff --git a/crates/wasmtime/src/component/instance.rs b/crates/wasmtime/src/component/instance.rs index a6ee266c2f2f..b81afef82272 100644 --- a/crates/wasmtime/src/component/instance.rs +++ b/crates/wasmtime/src/component/instance.rs @@ -194,18 +194,22 @@ impl InstanceData { instance.get_export_by_index(idx) } + #[inline] pub fn instance(&self) -> &ComponentInstance { &self.state } + #[inline] pub fn instance_ptr(&self) -> *mut ComponentInstance { self.state.instance_ptr() } + #[inline] pub fn component_types(&self) -> &Arc { self.component.types() } + #[inline] pub fn ty(&self) -> InstanceType<'_> { InstanceType::new(self.instance()) } diff --git a/crates/wasmtime/src/store.rs b/crates/wasmtime/src/store.rs index 437db281e33c..0f78c579c7d1 100644 --- a/crates/wasmtime/src/store.rs +++ b/crates/wasmtime/src/store.rs @@ -1580,6 +1580,7 @@ at https://bytecodealliance.org/security. 
std::process::abort(); } + #[inline] #[cfg(feature = "component-model")] pub(crate) fn component_calls_and_host_table( &mut self, From 5928278103891852b48fd5fed941bbb2a091c93c Mon Sep 17 00:00:00 2001 From: Afonso Bordado Date: Sat, 9 Sep 2023 18:27:10 +0100 Subject: [PATCH 02/14] riscv64: Delete unused code (#6984) * riscv64: Delete ECall instruction * riscv64: Delete `fence.i` instruction * riscv64: Delete `emit_fneg` --- cranelift/codegen/src/isa/riscv64/inst.isle | 4 ---- .../codegen/src/isa/riscv64/inst/emit.rs | 19 ------------------- .../src/isa/riscv64/inst/emit_tests.rs | 2 -- cranelift/codegen/src/isa/riscv64/inst/mod.rs | 4 ---- 4 files changed, 29 deletions(-) diff --git a/cranelift/codegen/src/isa/riscv64/inst.isle b/cranelift/codegen/src/isa/riscv64/inst.isle index 624579eef987..3d919104af9e 100644 --- a/cranelift/codegen/src/isa/riscv64/inst.isle +++ b/cranelift/codegen/src/isa/riscv64/inst.isle @@ -174,10 +174,6 @@ (pred FenceReq) (succ FenceReq)) - (FenceI) - - (ECall) - (EBreak) ;; An instruction guaranteed to always be undefined and to trigger an illegal instruction at diff --git a/cranelift/codegen/src/isa/riscv64/inst/emit.rs b/cranelift/codegen/src/isa/riscv64/inst/emit.rs index a61de15543d1..049c4bf49dbc 100644 --- a/cranelift/codegen/src/isa/riscv64/inst/emit.rs +++ b/cranelift/codegen/src/isa/riscv64/inst/emit.rs @@ -206,19 +206,6 @@ impl Inst { }); insts } - pub(crate) fn emit_fneg(rd: Writable, rs: Reg, ty: Type) -> Inst { - Inst::FpuRRR { - alu_op: if ty == F32 { - FpuOPRRR::FsgnjnS - } else { - FpuOPRRR::FsgnjnD - }, - frm: None, - rd: rd, - rs1: rs, - rs2: rs, - } - } pub(crate) fn lower_br_icmp( cc: IntCC, @@ -352,8 +339,6 @@ impl Inst { | Inst::Mov { .. } | Inst::MovFromPReg { .. } | Inst::Fence { .. } - | Inst::FenceI - | Inst::ECall | Inst::EBreak | Inst::Udf { .. } | Inst::FpuRR { .. 
} @@ -1226,7 +1211,6 @@ impl MachInstEmit for Inst { sink.put4(x); } - &Inst::FenceI => sink.put4(0x0000100f), &Inst::Auipc { rd, imm } => { let rd = allocs.next_writable(rd); let x = enc_auipc(rd, imm); @@ -1349,9 +1333,6 @@ impl MachInstEmit for Inst { let x = enc_jalr(rd, base, offset); sink.put4(x); } - &Inst::ECall => { - sink.put4(0x00000073); - } &Inst::EBreak => { sink.put4(0x00100073); } diff --git a/cranelift/codegen/src/isa/riscv64/inst/emit_tests.rs b/cranelift/codegen/src/isa/riscv64/inst/emit_tests.rs index 5349864c2c27..d44b7c3bbfeb 100644 --- a/cranelift/codegen/src/isa/riscv64/inst/emit_tests.rs +++ b/cranelift/codegen/src/isa/riscv64/inst/emit_tests.rs @@ -2049,8 +2049,6 @@ fn test_riscv64_binemit() { "fence w,r", 0x120000f, )); - insns.push(TestUnit::new(Inst::FenceI {}, "fence.i", 0x100f)); - insns.push(TestUnit::new(Inst::ECall {}, "ecall", 0x73)); insns.push(TestUnit::new(Inst::EBreak {}, "ebreak", 0x100073)); insns.push(TestUnit::new( diff --git a/cranelift/codegen/src/isa/riscv64/inst/mod.rs b/cranelift/codegen/src/isa/riscv64/inst/mod.rs index 75a836da088c..ec56b568d507 100644 --- a/cranelift/codegen/src/isa/riscv64/inst/mod.rs +++ b/cranelift/codegen/src/isa/riscv64/inst/mod.rs @@ -505,8 +505,6 @@ fn riscv64_get_operands VReg>(inst: &Inst, collector: &mut Operan collector.reg_def(rd); } &Inst::Fence { .. } => {} - &Inst::FenceI => {} - &Inst::ECall => {} &Inst::EBreak => {} &Inst::Udf { .. } => {} &Inst::FpuRR { rd, rs, .. 
} => { @@ -1748,7 +1746,6 @@ impl Inst { Inst::fence_req_to_string(succ), ) } - &MInst::FenceI => "fence.i".into(), &MInst::Select { ref dst, condition, @@ -1765,7 +1762,6 @@ impl Inst { } &MInst::Udf { trap_code } => format!("udf##trap_code={}", trap_code), &MInst::EBreak {} => String::from("ebreak"), - &MInst::ECall {} => String::from("ecall"), &Inst::VecAluRRRR { op, vd, From d8db07faf6620581168bd289c79fb08df39ac768 Mon Sep 17 00:00:00 2001 From: Afonso Bordado Date: Sat, 9 Sep 2023 18:42:53 +0100 Subject: [PATCH 03/14] cranelift: Fix `v{all,any}_true` and `vhigh_bits` instructions in the interpreter (#6985) * cranelift: Implement `vall_true` for floats in the interpreter * cranelift: Implement `vany_true` for floats in the interpreter * cranelift: Implement `vhigh_bits` for floats in the interpreter * cranelift: Forbid vector return types for `vhigh_bits` This instruction doesn't really make sense with a vector return type. The description also states that it returns a scalar integer, so I suspect it wasn't intended to allow vector integers.
* fuzzgen: Enable `v{all,any}_true` and `vhigh_bits` --- .../codegen/meta/src/shared/instructions.rs | 2 +- .../filetests/runtests/simd-valltrue.clif | 20 +++++++ .../filetests/runtests/simd-vanytrue.clif | 21 ++++++++ .../runtests/simd-vhighbits-float.clif | 26 +++++++++ cranelift/fuzzgen/src/function_generator.rs | 54 +------------------ cranelift/interpreter/src/step.rs | 18 ++++--- 6 files changed, 81 insertions(+), 60 deletions(-) create mode 100644 cranelift/filetests/filetests/runtests/simd-vhighbits-float.clif diff --git a/cranelift/codegen/meta/src/shared/instructions.rs b/cranelift/codegen/meta/src/shared/instructions.rs index 89aa5bb0c395..6727d9c8c345 100644 --- a/cranelift/codegen/meta/src/shared/instructions.rs +++ b/cranelift/codegen/meta/src/shared/instructions.rs @@ -1591,7 +1591,7 @@ pub(crate) fn define( &formats.unary, ) .operands_in(vec![Operand::new("a", TxN)]) - .operands_out(vec![Operand::new("x", Int)]), + .operands_out(vec![Operand::new("x", NarrowInt)]), ); ig.push( diff --git a/cranelift/filetests/filetests/runtests/simd-valltrue.clif b/cranelift/filetests/filetests/runtests/simd-valltrue.clif index 870687b7702c..436ad476f355 100644 --- a/cranelift/filetests/filetests/runtests/simd-valltrue.clif +++ b/cranelift/filetests/filetests/runtests/simd-valltrue.clif @@ -52,3 +52,23 @@ block0(v0: i64x2): ; run: %vall_true_i64x2([-1 0]) == 0 ; run: %vall_true_i64x2([-1 -1]) == 1 ; run: %vall_true_i64x2([0xffffffff_00000000 -1]) == 1 + +function %vall_true_f32x4(f32x4) -> i8 { +block0(v0: f32x4): + v1 = vall_true v0 + return v1 +} +; run: %vall_true_f32x4([0.0 0.0 0.0 0.0]) == 0 +; run: %vall_true_f32x4([0.0 -0.0 0.0 0.0]) == 0 +; run: %vall_true_f32x4([-0.0 -0.0 -0.0 -0.0]) == 1 +; run: %vall_true_f32x4([0x1.0 0x1.0 0x1.0 0x1.0]) == 1 + +function %vall_true_f64x2(f64x2) -> i8 { +block0(v0: f64x2): + v1 = vall_true v0 + return v1 +} +; run: %vall_true_f64x2([0.0 0.0]) == 0 +; run: %vall_true_f64x2([0.0 -0.0]) == 0 +; run: %vall_true_f64x2([-0.0 
-0.0]) == 1 +; run: %vall_true_f64x2([0x1.0 0x1.0]) == 1 diff --git a/cranelift/filetests/filetests/runtests/simd-vanytrue.clif b/cranelift/filetests/filetests/runtests/simd-vanytrue.clif index cc9389a50720..7095523975cd 100644 --- a/cranelift/filetests/filetests/runtests/simd-vanytrue.clif +++ b/cranelift/filetests/filetests/runtests/simd-vanytrue.clif @@ -45,3 +45,24 @@ block0(v0: i64x2): ; run: %vany_true_i64x2([0 0]) == 0 ; run: %vany_true_i64x2([-1 0]) == 1 ; run: %vany_true_i64x2([-1 -1]) == 1 + +function %vany_true_f32x4(f32x4) -> i8 { +block0(v0: f32x4): + v1 = vany_true v0 + return v1 +} +; run: %vany_true_f32x4([0.0 0.0 0.0 0.0]) == 0 +; run: %vany_true_f32x4([0.0 -0.0 0.0 0.0]) == 1 +; run: %vany_true_f32x4([-0.0 -0.0 -0.0 -0.0]) == 1 +; run: %vany_true_f32x4([0x1.0 0x1.0 0x1.0 0x1.0]) == 1 + + +function %vany_true_f64x2(f64x2) -> i8 { +block0(v0: f64x2): + v1 = vany_true v0 + return v1 +} +; run: %vany_true_f64x2([0.0 0.0]) == 0 +; run: %vany_true_f64x2([0.0 -0.0]) == 1 +; run: %vany_true_f64x2([-0.0 -0.0]) == 1 +; run: %vany_true_f64x2([0x1.0 0x1.0]) == 1 diff --git a/cranelift/filetests/filetests/runtests/simd-vhighbits-float.clif b/cranelift/filetests/filetests/runtests/simd-vhighbits-float.clif new file mode 100644 index 000000000000..228e5cf06075 --- /dev/null +++ b/cranelift/filetests/filetests/runtests/simd-vhighbits-float.clif @@ -0,0 +1,26 @@ +test interpret +test run +target s390x +target x86_64 has_sse3 has_ssse3 has_sse41 +target x86_64 has_sse3 has_ssse3 has_sse41 has_avx +target riscv64gc has_v + +function %vhighbits_f32x4(f32x4) -> i8 { +block0(v0: f32x4): + v1 = vhigh_bits.i8 v0 + return v1 +} +; run: %vhighbits_f32x4([0.0 0.0 0.0 0.0]) == 0 +; run: %vhighbits_f32x4([0.0 -0.0 0.0 0.0]) == 2 +; run: %vhighbits_f32x4([-0.0 -0.0 -0.0 -0.0]) == 0xF +; run: %vhighbits_f32x4([0x1.0 0x1.0 0x1.0 0x1.0]) == 0 + +function %vhighbits_f64x2(f64x2) -> i8 { +block0(v0: f64x2): + v1 = vhigh_bits.i8 v0 + return v1 +} +; run: %vhighbits_f64x2([0.0 0.0]) 
== 0 +; run: %vhighbits_f64x2([0.0 -0.0]) == 2 +; run: %vhighbits_f64x2([-0.0 -0.0]) == 3 +; run: %vhighbits_f64x2([0x1.0 0x1.0]) == 0 diff --git a/cranelift/fuzzgen/src/function_generator.rs b/cranelift/fuzzgen/src/function_generator.rs index 4ef64320526a..a264155cad68 100644 --- a/cranelift/fuzzgen/src/function_generator.rs +++ b/cranelift/fuzzgen/src/function_generator.rs @@ -670,6 +670,7 @@ fn valid_for_target(triple: &Triple, op: Opcode, args: &[Type], rets: &[Type]) - ), // TODO (Opcode::Bitselect, &[_, _, _], &[F32 | F64]), + (Opcode::VhighBits, &[F32X4 | F64X2]), ) } @@ -896,7 +897,6 @@ static OPCODE_SIGNATURES: Lazy> = Lazy::new(|| { (Opcode::TableAddr), (Opcode::Null), (Opcode::X86Blendv), - (Opcode::VallTrue), (Opcode::IcmpImm), (Opcode::X86Pmulhrsw), (Opcode::IaddImm), @@ -924,58 +924,6 @@ static OPCODE_SIGNATURES: Lazy> = Lazy::new(|| { (Opcode::ScalarToVector), (Opcode::X86Pmaddubsw), (Opcode::X86Cvtt2dq), - (Opcode::VanyTrue, &[F32X4], &[I8]), - (Opcode::VanyTrue, &[F64X2], &[I8]), - (Opcode::VhighBits, &[F32X4], &[I8]), - (Opcode::VhighBits, &[F64X2], &[I8]), - (Opcode::VhighBits, &[I8X16], &[I16]), - (Opcode::VhighBits, &[I16X8], &[I16]), - (Opcode::VhighBits, &[I32X4], &[I16]), - (Opcode::VhighBits, &[I64X2], &[I16]), - (Opcode::VhighBits, &[F32X4], &[I16]), - (Opcode::VhighBits, &[F64X2], &[I16]), - (Opcode::VhighBits, &[I8X16], &[I32]), - (Opcode::VhighBits, &[I16X8], &[I32]), - (Opcode::VhighBits, &[I32X4], &[I32]), - (Opcode::VhighBits, &[I64X2], &[I32]), - (Opcode::VhighBits, &[F32X4], &[I32]), - (Opcode::VhighBits, &[F64X2], &[I32]), - (Opcode::VhighBits, &[I8X16], &[I64]), - (Opcode::VhighBits, &[I16X8], &[I64]), - (Opcode::VhighBits, &[I32X4], &[I64]), - (Opcode::VhighBits, &[I64X2], &[I64]), - (Opcode::VhighBits, &[F32X4], &[I64]), - (Opcode::VhighBits, &[F64X2], &[I64]), - (Opcode::VhighBits, &[I8X16], &[I128]), - (Opcode::VhighBits, &[I16X8], &[I128]), - (Opcode::VhighBits, &[I32X4], &[I128]), - (Opcode::VhighBits, &[I64X2], &[I128]), - 
(Opcode::VhighBits, &[F32X4], &[I128]), - (Opcode::VhighBits, &[F64X2], &[I128]), - (Opcode::VhighBits, &[I8X16], &[I8X16]), - (Opcode::VhighBits, &[I16X8], &[I8X16]), - (Opcode::VhighBits, &[I32X4], &[I8X16]), - (Opcode::VhighBits, &[I64X2], &[I8X16]), - (Opcode::VhighBits, &[F32X4], &[I8X16]), - (Opcode::VhighBits, &[F64X2], &[I8X16]), - (Opcode::VhighBits, &[I8X16], &[I16X8]), - (Opcode::VhighBits, &[I16X8], &[I16X8]), - (Opcode::VhighBits, &[I32X4], &[I16X8]), - (Opcode::VhighBits, &[I64X2], &[I16X8]), - (Opcode::VhighBits, &[F32X4], &[I16X8]), - (Opcode::VhighBits, &[F64X2], &[I16X8]), - (Opcode::VhighBits, &[I8X16], &[I32X4]), - (Opcode::VhighBits, &[I16X8], &[I32X4]), - (Opcode::VhighBits, &[I32X4], &[I32X4]), - (Opcode::VhighBits, &[I64X2], &[I32X4]), - (Opcode::VhighBits, &[F32X4], &[I32X4]), - (Opcode::VhighBits, &[F64X2], &[I32X4]), - (Opcode::VhighBits, &[I8X16], &[I64X2]), - (Opcode::VhighBits, &[I16X8], &[I64X2]), - (Opcode::VhighBits, &[I32X4], &[I64X2]), - (Opcode::VhighBits, &[I64X2], &[I64X2]), - (Opcode::VhighBits, &[F32X4], &[I64X2]), - (Opcode::VhighBits, &[F64X2], &[I64X2]), (Opcode::Umulhi, &[I128, I128], &[I128]), (Opcode::Smulhi, &[I128, I128], &[I128]), // https://github.com/bytecodealliance/wasmtime/issues/6073 diff --git a/cranelift/interpreter/src/step.rs b/cranelift/interpreter/src/step.rs index a8ae06f906ea..8f437587e11d 100644 --- a/cranelift/interpreter/src/step.rs +++ b/cranelift/interpreter/src/step.rs @@ -1009,7 +1009,10 @@ where Opcode::VhighBits => { // `ctrl_ty` controls the return type for this, so the input type // must be retrieved via `inst_context`. - let vector_type = inst_context.type_of(inst_context.args()[0]).unwrap(); + let vector_type = inst_context + .type_of(inst_context.args()[0]) + .unwrap() + .as_int(); let a = extractlanes(&arg(0), vector_type)?; let mut result: u128 = 0; for (i, val) in a.into_iter().enumerate() { @@ -1019,15 +1022,18 @@ where assign(DataValueExt::int(result as i128, ctrl_ty)?) 
} Opcode::VanyTrue => { - let lane_ty = ctrl_ty.lane_type(); + let simd_ty = ctrl_ty.as_int(); + let lane_ty = simd_ty.lane_type(); let init = DataValue::bool(false, true, lane_ty)?; - let any = fold_vector(arg(0), ctrl_ty, init.clone(), |acc, lane| acc.or(lane))?; + let any = fold_vector(arg(0), simd_ty, init.clone(), |acc, lane| acc.or(lane))?; assign(DataValue::bool(any != init, false, types::I8)?) } Opcode::VallTrue => assign(DataValue::bool( - !(arg(0).iter_lanes(ctrl_ty)?.try_fold(false, |acc, lane| { - Ok::(acc | lane.is_zero()?) - })?), + !(arg(0) + .iter_lanes(ctrl_ty.as_int())? + .try_fold(false, |acc, lane| { + Ok::(acc | lane.is_zero()?) + })?), false, types::I8, )?), From 9f00198611537c7a9b384c9f9e3db85e0ca15123 Mon Sep 17 00:00:00 2001 From: Trevor Elliott Date: Sun, 10 Sep 2023 15:14:09 -0700 Subject: [PATCH 04/14] winch: Support abs and neg for f32 and f64 on x64 (#6982) * winch: Support f32.abs and f64.abs on x64 Co-authored-by: Nick Fitzgerald * Add an implementation of f32.neg and f64.neg * Enable spec tests for winch with f{32,64}.{neg,abs} * Enable differential fuzzing for f{32,64}.{neg,abs} for winch * Comments from code review --------- Co-authored-by: Nick Fitzgerald --- fuzz/fuzz_targets/differential.rs | 6 +- tests/misc_testsuite/winch/f32_bitwise.wast | 50 +++++++++++ tests/misc_testsuite/winch/f64_bitwise.wast | 50 +++++++++++ winch/codegen/src/isa/aarch64/masm.rs | 8 ++ winch/codegen/src/isa/x64/asm.rs | 84 +++++++++++++++---- winch/codegen/src/isa/x64/masm.rs | 30 +++++++ winch/codegen/src/isa/x64/regs.rs | 9 +- winch/codegen/src/masm.rs | 6 ++ winch/codegen/src/visitor.rs | 32 +++++++ .../filetests/x64/f32_abs/f32_abs_const.wat | 25 ++++++ .../filetests/x64/f32_abs/f32_abs_param.wat | 20 +++++ .../filetests/x64/f32_neg/f32_neg_const.wat | 25 ++++++ .../filetests/x64/f32_neg/f32_neg_param.wat | 20 +++++ .../filetests/x64/f64_abs/f64_abs_const.wat | 21 +++++ .../filetests/x64/f64_abs/f64_abs_param.wat | 21 +++++ 
.../filetests/x64/f64_neg/f64_neg_const.wat | 21 +++++ .../filetests/x64/f64_neg/f64_neg_param.wat | 21 +++++ 17 files changed, 431 insertions(+), 18 deletions(-) create mode 100644 tests/misc_testsuite/winch/f32_bitwise.wast create mode 100644 tests/misc_testsuite/winch/f64_bitwise.wast create mode 100644 winch/filetests/filetests/x64/f32_abs/f32_abs_const.wat create mode 100644 winch/filetests/filetests/x64/f32_abs/f32_abs_param.wat create mode 100644 winch/filetests/filetests/x64/f32_neg/f32_neg_const.wat create mode 100644 winch/filetests/filetests/x64/f32_neg/f32_neg_param.wat create mode 100644 winch/filetests/filetests/x64/f64_abs/f64_abs_const.wat create mode 100644 winch/filetests/filetests/x64/f64_abs/f64_abs_param.wat create mode 100644 winch/filetests/filetests/x64/f64_neg/f64_neg_const.wat create mode 100644 winch/filetests/filetests/x64/f64_neg/f64_neg_param.wat diff --git a/fuzz/fuzz_targets/differential.rs b/fuzz/fuzz_targets/differential.rs index 7d3c65f75483..9cbf7c608389 100644 --- a/fuzz/fuzz_targets/differential.rs +++ b/fuzz/fuzz_targets/differential.rs @@ -370,7 +370,11 @@ fn winch_supports_module(module: &[u8]) -> bool { | Unreachable { .. } | Return { .. } | F32Const { .. } - | F64Const { .. } => {} + | F64Const { .. } + | F32Abs { .. } + | F64Abs { .. } + | F32Neg { .. } + | F64Neg { .. } => {} _ => { supported = false; break 'main; diff --git a/tests/misc_testsuite/winch/f32_bitwise.wast b/tests/misc_testsuite/winch/f32_bitwise.wast new file mode 100644 index 000000000000..4816fceb43ea --- /dev/null +++ b/tests/misc_testsuite/winch/f32_bitwise.wast @@ -0,0 +1,50 @@ +;; Test all the f32 bitwise operators on major boundary values and all special +;; values. 
+ +(module + (func (export "abs") (param $x f32) (result f32) (f32.abs (local.get $x))) + (func (export "neg") (param $x f32) (result f32) (f32.neg (local.get $x))) +) + +(assert_return (invoke "abs" (f32.const -0x0p+0)) (f32.const 0x0p+0)) +(assert_return (invoke "abs" (f32.const 0x0p+0)) (f32.const 0x0p+0)) +(assert_return (invoke "abs" (f32.const -0x1p-149)) (f32.const 0x1p-149)) +(assert_return (invoke "abs" (f32.const 0x1p-149)) (f32.const 0x1p-149)) +(assert_return (invoke "abs" (f32.const -0x1p-126)) (f32.const 0x1p-126)) +(assert_return (invoke "abs" (f32.const 0x1p-126)) (f32.const 0x1p-126)) +(assert_return (invoke "abs" (f32.const -0x1p-1)) (f32.const 0x1p-1)) +(assert_return (invoke "abs" (f32.const 0x1p-1)) (f32.const 0x1p-1)) +(assert_return (invoke "abs" (f32.const -0x1p+0)) (f32.const 0x1p+0)) +(assert_return (invoke "abs" (f32.const 0x1p+0)) (f32.const 0x1p+0)) +(assert_return (invoke "abs" (f32.const -0x1.921fb6p+2)) (f32.const 0x1.921fb6p+2)) +(assert_return (invoke "abs" (f32.const 0x1.921fb6p+2)) (f32.const 0x1.921fb6p+2)) +(assert_return (invoke "abs" (f32.const -0x1.fffffep+127)) (f32.const 0x1.fffffep+127)) +(assert_return (invoke "abs" (f32.const 0x1.fffffep+127)) (f32.const 0x1.fffffep+127)) +(assert_return (invoke "abs" (f32.const -inf)) (f32.const inf)) +(assert_return (invoke "abs" (f32.const inf)) (f32.const inf)) +(assert_return (invoke "abs" (f32.const -nan)) (f32.const nan)) +(assert_return (invoke "abs" (f32.const nan)) (f32.const nan)) +(assert_return (invoke "neg" (f32.const -0x0p+0)) (f32.const 0x0p+0)) +(assert_return (invoke "neg" (f32.const 0x0p+0)) (f32.const -0x0p+0)) +(assert_return (invoke "neg" (f32.const -0x1p-149)) (f32.const 0x1p-149)) +(assert_return (invoke "neg" (f32.const 0x1p-149)) (f32.const -0x1p-149)) +(assert_return (invoke "neg" (f32.const -0x1p-126)) (f32.const 0x1p-126)) +(assert_return (invoke "neg" (f32.const 0x1p-126)) (f32.const -0x1p-126)) +(assert_return (invoke "neg" (f32.const -0x1p-1)) (f32.const 
0x1p-1)) +(assert_return (invoke "neg" (f32.const 0x1p-1)) (f32.const -0x1p-1)) +(assert_return (invoke "neg" (f32.const -0x1p+0)) (f32.const 0x1p+0)) +(assert_return (invoke "neg" (f32.const 0x1p+0)) (f32.const -0x1p+0)) +(assert_return (invoke "neg" (f32.const -0x1.921fb6p+2)) (f32.const 0x1.921fb6p+2)) +(assert_return (invoke "neg" (f32.const 0x1.921fb6p+2)) (f32.const -0x1.921fb6p+2)) +(assert_return (invoke "neg" (f32.const -0x1.fffffep+127)) (f32.const 0x1.fffffep+127)) +(assert_return (invoke "neg" (f32.const 0x1.fffffep+127)) (f32.const -0x1.fffffep+127)) +(assert_return (invoke "neg" (f32.const -inf)) (f32.const inf)) +(assert_return (invoke "neg" (f32.const inf)) (f32.const -inf)) +(assert_return (invoke "neg" (f32.const -nan)) (f32.const nan)) +(assert_return (invoke "neg" (f32.const nan)) (f32.const -nan)) + + +;; Type check + +(assert_invalid (module (func (result f32) (f32.abs (i64.const 0)))) "type mismatch") +(assert_invalid (module (func (result f32) (f32.neg (i64.const 0)))) "type mismatch") diff --git a/tests/misc_testsuite/winch/f64_bitwise.wast b/tests/misc_testsuite/winch/f64_bitwise.wast new file mode 100644 index 000000000000..d6b85d0fce84 --- /dev/null +++ b/tests/misc_testsuite/winch/f64_bitwise.wast @@ -0,0 +1,50 @@ +;; Test all the f64 bitwise operators on major boundary values and all special +;; values. 
+ +(module + (func (export "abs") (param $x f64) (result f64) (f64.abs (local.get $x))) + (func (export "neg") (param $x f64) (result f64) (f64.neg (local.get $x))) +) + +(assert_return (invoke "abs" (f64.const -0x0p+0)) (f64.const 0x0p+0)) +(assert_return (invoke "abs" (f64.const 0x0p+0)) (f64.const 0x0p+0)) +(assert_return (invoke "abs" (f64.const -0x0.0000000000001p-1022)) (f64.const 0x0.0000000000001p-1022)) +(assert_return (invoke "abs" (f64.const 0x0.0000000000001p-1022)) (f64.const 0x0.0000000000001p-1022)) +(assert_return (invoke "abs" (f64.const -0x1p-1022)) (f64.const 0x1p-1022)) +(assert_return (invoke "abs" (f64.const 0x1p-1022)) (f64.const 0x1p-1022)) +(assert_return (invoke "abs" (f64.const -0x1p-1)) (f64.const 0x1p-1)) +(assert_return (invoke "abs" (f64.const 0x1p-1)) (f64.const 0x1p-1)) +(assert_return (invoke "abs" (f64.const -0x1p+0)) (f64.const 0x1p+0)) +(assert_return (invoke "abs" (f64.const 0x1p+0)) (f64.const 0x1p+0)) +(assert_return (invoke "abs" (f64.const -0x1.921fb54442d18p+2)) (f64.const 0x1.921fb54442d18p+2)) +(assert_return (invoke "abs" (f64.const 0x1.921fb54442d18p+2)) (f64.const 0x1.921fb54442d18p+2)) +(assert_return (invoke "abs" (f64.const -0x1.fffffffffffffp+1023)) (f64.const 0x1.fffffffffffffp+1023)) +(assert_return (invoke "abs" (f64.const 0x1.fffffffffffffp+1023)) (f64.const 0x1.fffffffffffffp+1023)) +(assert_return (invoke "abs" (f64.const -inf)) (f64.const inf)) +(assert_return (invoke "abs" (f64.const inf)) (f64.const inf)) +(assert_return (invoke "abs" (f64.const -nan)) (f64.const nan)) +(assert_return (invoke "abs" (f64.const nan)) (f64.const nan)) +(assert_return (invoke "neg" (f64.const -0x0p+0)) (f64.const 0x0p+0)) +(assert_return (invoke "neg" (f64.const 0x0p+0)) (f64.const -0x0p+0)) +(assert_return (invoke "neg" (f64.const -0x0.0000000000001p-1022)) (f64.const 0x0.0000000000001p-1022)) +(assert_return (invoke "neg" (f64.const 0x0.0000000000001p-1022)) (f64.const -0x0.0000000000001p-1022)) +(assert_return (invoke 
"neg" (f64.const -0x1p-1022)) (f64.const 0x1p-1022)) +(assert_return (invoke "neg" (f64.const 0x1p-1022)) (f64.const -0x1p-1022)) +(assert_return (invoke "neg" (f64.const -0x1p-1)) (f64.const 0x1p-1)) +(assert_return (invoke "neg" (f64.const 0x1p-1)) (f64.const -0x1p-1)) +(assert_return (invoke "neg" (f64.const -0x1p+0)) (f64.const 0x1p+0)) +(assert_return (invoke "neg" (f64.const 0x1p+0)) (f64.const -0x1p+0)) +(assert_return (invoke "neg" (f64.const -0x1.921fb54442d18p+2)) (f64.const 0x1.921fb54442d18p+2)) +(assert_return (invoke "neg" (f64.const 0x1.921fb54442d18p+2)) (f64.const -0x1.921fb54442d18p+2)) +(assert_return (invoke "neg" (f64.const -0x1.fffffffffffffp+1023)) (f64.const 0x1.fffffffffffffp+1023)) +(assert_return (invoke "neg" (f64.const 0x1.fffffffffffffp+1023)) (f64.const -0x1.fffffffffffffp+1023)) +(assert_return (invoke "neg" (f64.const -inf)) (f64.const inf)) +(assert_return (invoke "neg" (f64.const inf)) (f64.const -inf)) +(assert_return (invoke "neg" (f64.const -nan)) (f64.const nan)) +(assert_return (invoke "neg" (f64.const nan)) (f64.const -nan)) + + +;; Type check + +(assert_invalid (module (func (result f64) (f64.abs (i64.const 0)))) "type mismatch") +(assert_invalid (module (func (result f64) (f64.neg (i64.const 0)))) "type mismatch") diff --git a/winch/codegen/src/isa/aarch64/masm.rs b/winch/codegen/src/isa/aarch64/masm.rs index 1468666f3c11..76f8157db18e 100644 --- a/winch/codegen/src/isa/aarch64/masm.rs +++ b/winch/codegen/src/isa/aarch64/masm.rs @@ -228,6 +228,14 @@ impl Masm for MacroAssembler { } } + fn float_neg(&mut self, _dst: Reg, _src: RegImm, _size: OperandSize) { + todo!() + } + + fn float_abs(&mut self, _dst: Reg, _src: RegImm, _size: OperandSize) { + todo!() + } + fn and(&mut self, _dst: RegImm, _lhs: RegImm, _rhs: RegImm, _size: OperandSize) { todo!() } diff --git a/winch/codegen/src/isa/x64/asm.rs b/winch/codegen/src/isa/x64/asm.rs index 1fe9a292b568..9c9a300f9ee7 100644 --- a/winch/codegen/src/isa/x64/asm.rs +++ 
b/winch/codegen/src/isa/x64/asm.rs @@ -1,7 +1,7 @@ //! Assembler library implementation for x64. use crate::{ - isa::reg::Reg, + isa::reg::{Reg, RegClass}, masm::{CalleeKind, CmpKind, DivKind, OperandSize, RemKind, ShiftKind}, }; use cranelift_codegen::{ @@ -410,18 +410,36 @@ impl Assembler { /// "and" two registers. pub fn and_rr(&mut self, src: Reg, dst: Reg, size: OperandSize) { - self.emit(Inst::AluRmiR { - size: size.into(), - op: AluRmiROpcode::And, - src1: dst.into(), - src2: src.into(), - dst: dst.into(), - }); + match dst.class() { + RegClass::Int => { + self.emit(Inst::AluRmiR { + size: size.into(), + op: AluRmiROpcode::And, + src1: dst.into(), + src2: src.into(), + dst: dst.into(), + }); + } + RegClass::Float => { + let op = match size { + OperandSize::S32 => SseOpcode::Andps, + OperandSize::S64 => SseOpcode::Andpd, + OperandSize::S128 => unreachable!(), + }; + + self.emit(Inst::XmmRmR { + op, + src1: dst.into(), + src2: XmmMemAligned::from(Xmm::from(src)), + dst: dst.into(), + }); + } + RegClass::Vector => unreachable!(), + } } pub fn and_ir(&mut self, imm: i32, dst: Reg, size: OperandSize) { let imm = RegMemImm::imm(imm as u32); - self.emit(Inst::AluRmiR { size: size.into(), op: AluRmiROpcode::And, @@ -431,6 +449,21 @@ impl Assembler { }); } + pub fn gpr_to_xmm(&mut self, src: Reg, dst: Reg, size: OperandSize) { + let op = match size { + OperandSize::S32 => SseOpcode::Movd, + OperandSize::S64 => SseOpcode::Movq, + OperandSize::S128 => unreachable!(), + }; + + self.emit(Inst::GprToXmm { + op, + src: src.into(), + dst: dst.into(), + src_size: size.into(), + }) + } + pub fn or_rr(&mut self, src: Reg, dst: Reg, size: OperandSize) { self.emit(Inst::AluRmiR { size: size.into(), @@ -455,13 +488,32 @@ impl Assembler { /// Logical exclusive or with registers. 
pub fn xor_rr(&mut self, src: Reg, dst: Reg, size: OperandSize) { - self.emit(Inst::AluRmiR { - size: size.into(), - op: AluRmiROpcode::Xor, - src1: dst.into(), - src2: src.into(), - dst: dst.into(), - }); + match dst.class() { + RegClass::Int => { + self.emit(Inst::AluRmiR { + size: size.into(), + op: AluRmiROpcode::Xor, + src1: dst.into(), + src2: src.into(), + dst: dst.into(), + }); + } + RegClass::Float => { + let op = match size { + OperandSize::S32 => SseOpcode::Xorps, + OperandSize::S64 => SseOpcode::Xorpd, + OperandSize::S128 => unreachable!(), + }; + + self.emit(Inst::XmmRmR { + op, + src1: dst.into(), + src2: XmmMemAligned::from(Xmm::from(src)), + dst: dst.into(), + }); + } + RegClass::Vector => todo!(), + } } pub fn xor_ir(&mut self, imm: i32, dst: Reg, size: OperandSize) { diff --git a/winch/codegen/src/isa/x64/masm.rs b/winch/codegen/src/isa/x64/masm.rs index 55a61ed9a0da..af39b793ed8c 100644 --- a/winch/codegen/src/isa/x64/masm.rs +++ b/winch/codegen/src/isa/x64/masm.rs @@ -265,6 +265,36 @@ impl Masm for MacroAssembler { } } + fn float_neg(&mut self, dst: Reg, src: RegImm, size: OperandSize) { + Self::ensure_two_argument_form(&dst.into(), &src); + assert_eq!(dst.class(), RegClass::Float); + let mask = match size { + OperandSize::S32 => I::I32(0x80000000), + OperandSize::S64 => I::I64(0x8000000000000000), + OperandSize::S128 => unreachable!(), + }; + let scratch_gpr = regs::scratch(); + self.load_constant(&mask, scratch_gpr, size); + let scratch_xmm = regs::scratch_xmm(); + self.asm.gpr_to_xmm(scratch_gpr, scratch_xmm, size); + self.asm.xor_rr(scratch_xmm, dst, size); + } + + fn float_abs(&mut self, dst: Reg, src: RegImm, size: OperandSize) { + Self::ensure_two_argument_form(&dst.into(), &src); + assert_eq!(dst.class(), RegClass::Float); + let mask = match size { + OperandSize::S32 => I::I32(0x7fffffff), + OperandSize::S64 => I::I64(0x7fffffffffffffff), + OperandSize::S128 => unreachable!(), + }; + let scratch_gpr = regs::scratch(); + 
self.load_constant(&mask, scratch_gpr, size); + let scratch_xmm = regs::scratch_xmm(); + self.asm.gpr_to_xmm(scratch_gpr, scratch_xmm, size); + self.asm.and_rr(scratch_xmm, dst, size); + } + fn and(&mut self, dst: RegImm, lhs: RegImm, rhs: RegImm, size: OperandSize) { Self::ensure_two_argument_form(&dst, &lhs); match (rhs, dst) { diff --git a/winch/codegen/src/isa/x64/regs.rs b/winch/codegen/src/isa/x64/regs.rs index 02a84503b46c..0724b1d67252 100644 --- a/winch/codegen/src/isa/x64/regs.rs +++ b/winch/codegen/src/isa/x64/regs.rs @@ -166,6 +166,10 @@ pub(crate) fn xmm15() -> Reg { fpr(15) } +pub(crate) fn scratch_xmm() -> Reg { + xmm15() +} + const GPR: u32 = 16; const FPR: u32 = 16; const ALLOCATABLE_GPR: u32 = (1 << GPR) - 1; @@ -174,12 +178,15 @@ const ALLOCATABLE_FPR: u32 = (1 << FPR) - 1; // R14: Is a pinned register, used as the instance register. const NON_ALLOCATABLE_GPR: u32 = (1 << ENC_RBP) | (1 << ENC_RSP) | (1 << ENC_R11) | (1 << ENC_R14); +// xmm15: Is used as the scratch register. +const NON_ALLOCATABLE_FPR: u32 = 1 << 15; + /// Bitmask to represent the available general purpose registers. pub(crate) const ALL_GPR: u32 = ALLOCATABLE_GPR & !NON_ALLOCATABLE_GPR; /// Bitmask to represent the available floating point registers. // Note: at the time of writing all floating point registers are allocatable, // but we might need a scratch register in the future. -pub(crate) const ALL_FPR: u32 = ALLOCATABLE_FPR; +pub(crate) const ALL_FPR: u32 = ALLOCATABLE_FPR & !NON_ALLOCATABLE_FPR; /// Returns the callee-saved registers according to a particular calling /// convention. diff --git a/winch/codegen/src/masm.rs b/winch/codegen/src/masm.rs index c7bdb1111e5d..0ad22b579cae 100644 --- a/winch/codegen/src/masm.rs +++ b/winch/codegen/src/masm.rs @@ -317,6 +317,12 @@ pub(crate) trait MacroAssembler { /// Perform multiplication operation. fn mul(&mut self, dst: RegImm, lhs: RegImm, rhs: RegImm, size: OperandSize); + /// Perform a floating point abs operation. 
+ fn float_abs(&mut self, dst: Reg, src: RegImm, size: OperandSize); + + /// Perform a floating point negation operation. + fn float_neg(&mut self, dst: Reg, src: RegImm, size: OperandSize); + /// Perform logical and operation. fn and(&mut self, dst: RegImm, lhs: RegImm, rhs: RegImm, size: OperandSize); diff --git a/winch/codegen/src/visitor.rs b/winch/codegen/src/visitor.rs index bb509e3e5cfb..097a2279a7f2 100644 --- a/winch/codegen/src/visitor.rs +++ b/winch/codegen/src/visitor.rs @@ -39,6 +39,10 @@ macro_rules! def_unsupported { (emit I64Const $($rest:tt)*) => {}; (emit F32Const $($rest:tt)*) => {}; (emit F64Const $($rest:tt)*) => {}; + (emit F32Abs $($rest:tt)*) => {}; + (emit F64Abs $($rest:tt)*) => {}; + (emit F32Neg $($rest:tt)*) => {}; + (emit F64Neg $($rest:tt)*) => {}; (emit I32Add $($rest:tt)*) => {}; (emit I64Add $($rest:tt)*) => {}; (emit I32Sub $($rest:tt)*) => {}; @@ -142,6 +146,34 @@ where self.context.stack.push(Val::f64(val)); } + fn visit_f32_abs(&mut self) { + self.context + .unop(self.masm, OperandSize::S32, &mut |masm, reg, size| { + masm.float_abs(reg, RegImm::Reg(reg), size); + }); + } + + fn visit_f64_abs(&mut self) { + self.context + .unop(self.masm, OperandSize::S64, &mut |masm, reg, size| { + masm.float_abs(reg, RegImm::Reg(reg), size); + }); + } + + fn visit_f32_neg(&mut self) { + self.context + .unop(self.masm, OperandSize::S32, &mut |masm, reg, size| { + masm.float_neg(reg, RegImm::Reg(reg), size); + }); + } + + fn visit_f64_neg(&mut self) { + self.context + .unop(self.masm, OperandSize::S64, &mut |masm, reg, size| { + masm.float_neg(reg, RegImm::Reg(reg), size); + }); + } + fn visit_i32_add(&mut self) { self.context.i32_binop(self.masm, |masm, dst, src, size| { masm.add(dst, dst, src, size); diff --git a/winch/filetests/filetests/x64/f32_abs/f32_abs_const.wat b/winch/filetests/filetests/x64/f32_abs/f32_abs_const.wat new file mode 100644 index 000000000000..7500e7a462b2 --- /dev/null +++ 
b/winch/filetests/filetests/x64/f32_abs/f32_abs_const.wat @@ -0,0 +1,25 @@ +;;! target = "x86_64" + +(module + (func (result f32) + (f32.const -1.32) + (f32.abs) + ) +) +;; 0: 55 push rbp +;; 1: 4889e5 mov rbp, rsp +;; 4: 4883ec08 sub rsp, 8 +;; 8: 4c893424 mov qword ptr [rsp], r14 +;; c: f30f10051c000000 movss xmm0, dword ptr [rip + 0x1c] +;; 14: 41bbffffff7f mov r11d, 0x7fffffff +;; 1a: 66450f6efb movd xmm15, r11d +;; 1f: 410f54c7 andps xmm0, xmm15 +;; 23: 4883c408 add rsp, 8 +;; 27: 5d pop rbp +;; 28: c3 ret +;; 29: 0000 add byte ptr [rax], al +;; 2b: 0000 add byte ptr [rax], al +;; 2d: 0000 add byte ptr [rax], al +;; 2f: 00c3 add bl, al +;; 31: f5 cmc +;; 32: a8bf test al, 0xbf diff --git a/winch/filetests/filetests/x64/f32_abs/f32_abs_param.wat b/winch/filetests/filetests/x64/f32_abs/f32_abs_param.wat new file mode 100644 index 000000000000..b2cc9206af77 --- /dev/null +++ b/winch/filetests/filetests/x64/f32_abs/f32_abs_param.wat @@ -0,0 +1,20 @@ +;;! target = "x86_64" + +(module + (func (param f32) (result f32) + (local.get 0) + (f32.abs) + ) +) +;; 0: 55 push rbp +;; 1: 4889e5 mov rbp, rsp +;; 4: 4883ec10 sub rsp, 0x10 +;; 8: f30f1144240c movss dword ptr [rsp + 0xc], xmm0 +;; e: 4c89742404 mov qword ptr [rsp + 4], r14 +;; 13: f30f1044240c movss xmm0, dword ptr [rsp + 0xc] +;; 19: 41bbffffff7f mov r11d, 0x7fffffff +;; 1f: 66450f6efb movd xmm15, r11d +;; 24: 410f54c7 andps xmm0, xmm15 +;; 28: 4883c410 add rsp, 0x10 +;; 2c: 5d pop rbp +;; 2d: c3 ret diff --git a/winch/filetests/filetests/x64/f32_neg/f32_neg_const.wat b/winch/filetests/filetests/x64/f32_neg/f32_neg_const.wat new file mode 100644 index 000000000000..07d16db3a524 --- /dev/null +++ b/winch/filetests/filetests/x64/f32_neg/f32_neg_const.wat @@ -0,0 +1,25 @@ +;;! 
target = "x86_64" + +(module + (func (result f32) + (f32.const -1.32) + (f32.neg) + ) +) +;; 0: 55 push rbp +;; 1: 4889e5 mov rbp, rsp +;; 4: 4883ec08 sub rsp, 8 +;; 8: 4c893424 mov qword ptr [rsp], r14 +;; c: f30f10051c000000 movss xmm0, dword ptr [rip + 0x1c] +;; 14: 41bb00000080 mov r11d, 0x80000000 +;; 1a: 66450f6efb movd xmm15, r11d +;; 1f: 410f57c7 xorps xmm0, xmm15 +;; 23: 4883c408 add rsp, 8 +;; 27: 5d pop rbp +;; 28: c3 ret +;; 29: 0000 add byte ptr [rax], al +;; 2b: 0000 add byte ptr [rax], al +;; 2d: 0000 add byte ptr [rax], al +;; 2f: 00c3 add bl, al +;; 31: f5 cmc +;; 32: a8bf test al, 0xbf diff --git a/winch/filetests/filetests/x64/f32_neg/f32_neg_param.wat b/winch/filetests/filetests/x64/f32_neg/f32_neg_param.wat new file mode 100644 index 000000000000..3f39aa88f8c1 --- /dev/null +++ b/winch/filetests/filetests/x64/f32_neg/f32_neg_param.wat @@ -0,0 +1,20 @@ +;;! target = "x86_64" + +(module + (func (param f32) (result f32) + (local.get 0) + (f32.neg) + ) +) +;; 0: 55 push rbp +;; 1: 4889e5 mov rbp, rsp +;; 4: 4883ec10 sub rsp, 0x10 +;; 8: f30f1144240c movss dword ptr [rsp + 0xc], xmm0 +;; e: 4c89742404 mov qword ptr [rsp + 4], r14 +;; 13: f30f1044240c movss xmm0, dword ptr [rsp + 0xc] +;; 19: 41bb00000080 mov r11d, 0x80000000 +;; 1f: 66450f6efb movd xmm15, r11d +;; 24: 410f57c7 xorps xmm0, xmm15 +;; 28: 4883c410 add rsp, 0x10 +;; 2c: 5d pop rbp +;; 2d: c3 ret diff --git a/winch/filetests/filetests/x64/f64_abs/f64_abs_const.wat b/winch/filetests/filetests/x64/f64_abs/f64_abs_const.wat new file mode 100644 index 000000000000..991651614d09 --- /dev/null +++ b/winch/filetests/filetests/x64/f64_abs/f64_abs_const.wat @@ -0,0 +1,21 @@ +;;! 
target = "x86_64" + +(module + (func (result f64) + (f64.const -1.32) + (f64.abs) + ) +) +;; 0: 55 push rbp +;; 1: 4889e5 mov rbp, rsp +;; 4: 4883ec08 sub rsp, 8 +;; 8: 4c893424 mov qword ptr [rsp], r14 +;; c: f20f10051c000000 movsd xmm0, qword ptr [rip + 0x1c] +;; 14: 49bbffffffffffffff7f +;; movabs r11, 0x7fffffffffffffff +;; 1e: 664d0f6efb movq xmm15, r11 +;; 23: 66410f54c7 andpd xmm0, xmm15 +;; 28: 4883c408 add rsp, 8 +;; 2c: 5d pop rbp +;; 2d: c3 ret +;; 2e: 0000 add byte ptr [rax], al diff --git a/winch/filetests/filetests/x64/f64_abs/f64_abs_param.wat b/winch/filetests/filetests/x64/f64_abs/f64_abs_param.wat new file mode 100644 index 000000000000..117a045de256 --- /dev/null +++ b/winch/filetests/filetests/x64/f64_abs/f64_abs_param.wat @@ -0,0 +1,21 @@ +;;! target = "x86_64" + +(module + (func (param f64) (result f64) + (local.get 0) + (f64.abs) + ) +) +;; 0: 55 push rbp +;; 1: 4889e5 mov rbp, rsp +;; 4: 4883ec10 sub rsp, 0x10 +;; 8: f20f11442408 movsd qword ptr [rsp + 8], xmm0 +;; e: 4c893424 mov qword ptr [rsp], r14 +;; 12: f20f10442408 movsd xmm0, qword ptr [rsp + 8] +;; 18: 49bbffffffffffffff7f +;; movabs r11, 0x7fffffffffffffff +;; 22: 664d0f6efb movq xmm15, r11 +;; 27: 66410f54c7 andpd xmm0, xmm15 +;; 2c: 4883c410 add rsp, 0x10 +;; 30: 5d pop rbp +;; 31: c3 ret diff --git a/winch/filetests/filetests/x64/f64_neg/f64_neg_const.wat b/winch/filetests/filetests/x64/f64_neg/f64_neg_const.wat new file mode 100644 index 000000000000..41046ab377bc --- /dev/null +++ b/winch/filetests/filetests/x64/f64_neg/f64_neg_const.wat @@ -0,0 +1,21 @@ +;;! 
target = "x86_64" + +(module + (func (result f64) + (f64.const -1.32) + (f64.neg) + ) +) +;; 0: 55 push rbp +;; 1: 4889e5 mov rbp, rsp +;; 4: 4883ec08 sub rsp, 8 +;; 8: 4c893424 mov qword ptr [rsp], r14 +;; c: f20f10051c000000 movsd xmm0, qword ptr [rip + 0x1c] +;; 14: 49bb0000000000000080 +;; movabs r11, 0x8000000000000000 +;; 1e: 664d0f6efb movq xmm15, r11 +;; 23: 66410f57c7 xorpd xmm0, xmm15 +;; 28: 4883c408 add rsp, 8 +;; 2c: 5d pop rbp +;; 2d: c3 ret +;; 2e: 0000 add byte ptr [rax], al diff --git a/winch/filetests/filetests/x64/f64_neg/f64_neg_param.wat b/winch/filetests/filetests/x64/f64_neg/f64_neg_param.wat new file mode 100644 index 000000000000..96c9b0fdce85 --- /dev/null +++ b/winch/filetests/filetests/x64/f64_neg/f64_neg_param.wat @@ -0,0 +1,21 @@ +;;! target = "x86_64" + +(module + (func (param f64) (result f64) + (local.get 0) + (f64.neg) + ) +) +;; 0: 55 push rbp +;; 1: 4889e5 mov rbp, rsp +;; 4: 4883ec10 sub rsp, 0x10 +;; 8: f20f11442408 movsd qword ptr [rsp + 8], xmm0 +;; e: 4c893424 mov qword ptr [rsp], r14 +;; 12: f20f10442408 movsd xmm0, qword ptr [rsp + 8] +;; 18: 49bb0000000000000080 +;; movabs r11, 0x8000000000000000 +;; 22: 664d0f6efb movq xmm15, r11 +;; 27: 66410f57c7 xorpd xmm0, xmm15 +;; 2c: 4883c410 add rsp, 0x10 +;; 30: 5d pop rbp +;; 31: c3 ret From 2186668f5287062090914574f2742ccde547c965 Mon Sep 17 00:00:00 2001 From: Michael Chesser Date: Mon, 11 Sep 2023 23:23:20 +0930 Subject: [PATCH 05/14] Cranelift: Improve codegen of store_imm on x64 (#6979) * Improve lowering of store_imm on x64 Adds a new x64 rule for directly lowering stores of immediates with a MOV instruction. * Ensure that the MovImmM operand fits in an i32 and add tests. 
* Update winch to handle MovImmM change --- cranelift/codegen/src/isa/x64/inst.isle | 7 +- cranelift/codegen/src/isa/x64/inst/emit.rs | 12 +- .../codegen/src/isa/x64/inst/emit_tests.rs | 24 +-- cranelift/codegen/src/isa/x64/inst/mod.rs | 10 +- cranelift/codegen/src/isa/x64/lower.isle | 5 + .../filetests/isa/x64/store-imm.clif | 201 ++++++++++++++++++ .../filetests/isa/x64/struct-ret.clif | 6 +- winch/codegen/src/isa/x64/asm.rs | 4 +- winch/codegen/src/isa/x64/masm.rs | 9 +- 9 files changed, 243 insertions(+), 35 deletions(-) create mode 100644 cranelift/filetests/filetests/isa/x64/store-imm.clif diff --git a/cranelift/codegen/src/isa/x64/inst.isle b/cranelift/codegen/src/isa/x64/inst.isle index e6a7fc38413c..b772a17d96bc 100644 --- a/cranelift/codegen/src/isa/x64/inst.isle +++ b/cranelift/codegen/src/isa/x64/inst.isle @@ -171,7 +171,7 @@ ;; Immediate store. (MovImmM (size OperandSize) - (simm64 u64) + (simm32 i32) (dst SyntheticAmode)) ;; Integer stores: mov (b w l q) reg addr. @@ -2368,6 +2368,11 @@ (let ((size OperandSize (raw_operand_size_of_type ty))) (SideEffectNoResult.Inst (MInst.MovRM size data addr)))) +(decl x64_movimm_m (Type SyntheticAmode i32) SideEffectNoResult) +(rule (x64_movimm_m ty addr imm) + (let ((size OperandSize (raw_operand_size_of_type ty))) + (SideEffectNoResult.Inst (MInst.MovImmM size imm addr)))) + (decl xmm_movrm (SseOpcode SyntheticAmode Xmm) SideEffectNoResult) (rule (xmm_movrm op addr data) (SideEffectNoResult.Inst (MInst.XmmMovRM op data addr))) diff --git a/cranelift/codegen/src/isa/x64/inst/emit.rs b/cranelift/codegen/src/isa/x64/inst/emit.rs index 9360fc039566..652d60b2afad 100644 --- a/cranelift/codegen/src/isa/x64/inst/emit.rs +++ b/cranelift/codegen/src/isa/x64/inst/emit.rs @@ -797,7 +797,7 @@ pub(crate) fn emit( } } - Inst::MovImmM { size, simm64, dst } => { + Inst::MovImmM { size, simm32, dst } => { let dst = &dst.finalize(state, sink).with_allocs(allocs); let default_rex = RexFlags::clear_w(); let default_opcode = 0xC7; @@ 
-810,13 +810,7 @@ pub(crate) fn emit( // operand is a memory operand, not a possibly 8-bit register. OperandSize::Size8 => (0xC6, default_rex, bytes, prefix), OperandSize::Size16 => (0xC7, default_rex, bytes, LegacyPrefixes::_66), - OperandSize::Size64 => { - if !low32_will_sign_extend_to_64(*simm64) { - panic!("Immediate-to-memory moves require immediate operand to sign-extend to 64 bits."); - } - - (default_opcode, RexFlags::from(*size), bytes, prefix) - } + OperandSize::Size64 => (default_opcode, RexFlags::from(*size), bytes, prefix), _ => (default_opcode, default_rex, bytes, prefix), }; @@ -826,7 +820,7 @@ pub(crate) fn emit( // 32-bit C7 /0 id // 64-bit REX.W C7 /0 id emit_std_enc_mem(sink, prefix, opcode, 1, /*subopcode*/ 0, dst, rex, 0); - emit_simm(sink, size, *simm64 as u32); + emit_simm(sink, size, *simm32 as u32); } Inst::MovRR { size, src, dst } => { diff --git a/cranelift/codegen/src/isa/x64/inst/emit_tests.rs b/cranelift/codegen/src/isa/x64/inst/emit_tests.rs index 9aff9d2f0555..cc8f8e318023 100644 --- a/cranelift/codegen/src/isa/x64/inst/emit_tests.rs +++ b/cranelift/codegen/src/isa/x64/inst/emit_tests.rs @@ -2856,7 +2856,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size8, - simm64: i8::MIN as u64, + simm32: i8::MIN as i32, dst: Amode::imm_reg(99, rax).into(), }, "C6406380", @@ -2866,7 +2866,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size8, - simm64: i8::MAX as u64, + simm32: i8::MAX as i32, dst: Amode::imm_reg(99, r8).into(), }, "41C640637F", @@ -2876,7 +2876,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size16, - simm64: i16::MIN as u64, + simm32: i16::MIN as i32, dst: Amode::imm_reg(99, rcx).into(), }, "66C741630080", @@ -2886,7 +2886,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size16, - simm64: i16::MAX as u64, + simm32: i16::MAX as i32, dst: Amode::imm_reg(99, r9).into(), }, "6641C74163FF7F", @@ -2896,7 +2896,7 @@ fn 
test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size32, - simm64: i32::MIN as u64, + simm32: i32::MIN, dst: Amode::imm_reg(99, rdx).into(), }, "C7426300000080", @@ -2906,7 +2906,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size32, - simm64: i32::MAX as u64, + simm32: i32::MAX, dst: Amode::imm_reg(99, r10).into(), }, "41C74263FFFFFF7F", @@ -2916,7 +2916,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size64, - simm64: i32::MIN as u64, + simm32: i32::MIN, dst: Amode::imm_reg(99, rbx).into(), }, "48C7436300000080", @@ -2926,7 +2926,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size64, - simm64: i32::MAX as u64, + simm32: i32::MAX, dst: Amode::imm_reg(99, r11).into(), }, "49C74363FFFFFF7F", @@ -2936,7 +2936,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size8, - simm64: 0u64, + simm32: 0i32, dst: Amode::imm_reg(99, rsp).into(), }, "C644246300", @@ -2946,7 +2946,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size16, - simm64: 0u64, + simm32: 0i32, dst: Amode::imm_reg(99, r12).into(), }, "6641C74424630000", @@ -2956,7 +2956,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size32, - simm64: 0u64, + simm32: 0i32, dst: Amode::imm_reg(99, rbp).into(), }, "C7456300000000", @@ -2966,7 +2966,7 @@ fn test_x64_emit() { insns.push(( Inst::MovImmM { size: OperandSize::Size64, - simm64: 0u64, + simm32: 0i32, dst: Amode::imm_reg(99, r13).into(), }, "49C7456300000000", diff --git a/cranelift/codegen/src/isa/x64/inst/mod.rs b/cranelift/codegen/src/isa/x64/inst/mod.rs index c09df5fc2d3f..07590a5bc612 100644 --- a/cranelift/codegen/src/isa/x64/inst/mod.rs +++ b/cranelift/codegen/src/isa/x64/inst/mod.rs @@ -1369,14 +1369,14 @@ impl PrettyPrint for Inst { } } - Inst::MovImmM { size, simm64, dst } => { + Inst::MovImmM { size, simm32, dst } => { let dst = dst.pretty_print(size.to_bytes(), allocs); let suffix = 
suffix_bwlq(*size); let imm = match *size { - OperandSize::Size8 => ((*simm64 as u8) as i8).to_string(), - OperandSize::Size16 => ((*simm64 as u16) as i16).to_string(), - OperandSize::Size32 => ((*simm64 as u32) as i32).to_string(), - OperandSize::Size64 => (*simm64 as i64).to_string(), + OperandSize::Size8 => ((*simm32 as u8) as i8).to_string(), + OperandSize::Size16 => ((*simm32 as u16) as i16).to_string(), + OperandSize::Size32 => simm32.to_string(), + OperandSize::Size64 => (*simm32 as i64).to_string(), }; let op = ljustify2("mov".to_string(), suffix); format!("{op} ${imm}, {dst}") diff --git a/cranelift/codegen/src/isa/x64/lower.isle b/cranelift/codegen/src/isa/x64/lower.isle index bf391a50ec3f..7820cdb2b518 100644 --- a/cranelift/codegen/src/isa/x64/lower.isle +++ b/cranelift/codegen/src/isa/x64/lower.isle @@ -2912,6 +2912,11 @@ (side_effect (x64_movrm $I32 (to_amode flags address offset) value))) +;; IMM stores +(rule 2 (lower (store flags (has_type (fits_in_64 ty) (iconst (simm32 value))) address offset)) + (side_effect + (x64_movimm_m ty (to_amode flags address offset) value))) + ;; F32 stores of values in XMM registers. 
(rule 1 (lower (store flags value @ (value_type $F32) diff --git a/cranelift/filetests/filetests/isa/x64/store-imm.clif b/cranelift/filetests/filetests/isa/x64/store-imm.clif new file mode 100644 index 000000000000..e0c774b4e418 --- /dev/null +++ b/cranelift/filetests/filetests/isa/x64/store-imm.clif @@ -0,0 +1,201 @@ +test compile precise-output +target x86_64 + +function %store_imm8(i64 sret) { +block0(v0: i64): + v1 = iconst.i8 0x12 + store v1, v0 + return +} + +; VCode: +; pushq %rbp +; movq %rsp, %rbp +; block0: +; movb $18, 0(%rdi) +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; ret +; +; Disassembled: +; block0: ; offset 0x0 +; pushq %rbp +; movq %rsp, %rbp +; block1: ; offset 0x4 +; movb $0x12, (%rdi) ; trap: heap_oob +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; retq + +function %store_imm16(i64 sret) { +block0(v0: i64): + v1 = iconst.i16 0x1234 + store v1, v0 + return +} + +; VCode: +; pushq %rbp +; movq %rsp, %rbp +; block0: +; movw $4660, 0(%rdi) +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; ret +; +; Disassembled: +; block0: ; offset 0x0 +; pushq %rbp +; movq %rsp, %rbp +; block1: ; offset 0x4 +; movw $0x1234, (%rdi) ; trap: heap_oob +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; retq + +function %store_imm32(i64 sret) { +block0(v0: i64): + v1 = iconst.i32 0x1234_5678 + store v1, v0 + return +} + +; VCode: +; pushq %rbp +; movq %rsp, %rbp +; block0: +; movl $305419896, 0(%rdi) +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; ret +; +; Disassembled: +; block0: ; offset 0x0 +; pushq %rbp +; movq %rsp, %rbp +; block1: ; offset 0x4 +; movl $0x12345678, (%rdi) ; trap: heap_oob +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; retq + +function %store_imm64(i64 sret) { +block0(v0: i64): + v1 = iconst.i64 0x1234_5678 + store v1, v0 + return +} + +; VCode: +; pushq %rbp +; movq %rsp, %rbp +; block0: +; movq $305419896, 0(%rdi) +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; ret +; +; Disassembled: +; block0: ; offset 
0x0 +; pushq %rbp +; movq %rsp, %rbp +; block1: ; offset 0x4 +; movq $0x12345678, (%rdi) ; trap: heap_oob +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; retq + +function %store_max_i32_imm64(i64 sret) { +block0(v0: i64): + v1 = iconst.i64 0x7fff_ffff + store v1, v0 + return +} + +; VCode: +; pushq %rbp +; movq %rsp, %rbp +; block0: +; movq $2147483647, 0(%rdi) +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; ret +; +; Disassembled: +; block0: ; offset 0x0 +; pushq %rbp +; movq %rsp, %rbp +; block1: ; offset 0x4 +; movq $0x7fffffff, (%rdi) ; trap: heap_oob +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; retq + +function %store_min_i32_imm64(i64 sret) { +block0(v0: i64): + v1 = iconst.i64 -2_147_483_648 + store v1, v0 + return +} + +; VCode: +; pushq %rbp +; movq %rsp, %rbp +; block0: +; movq $-2147483648, 0(%rdi) +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; ret +; +; Disassembled: +; block0: ; offset 0x0 +; pushq %rbp +; movq %rsp, %rbp +; block1: ; offset 0x4 +; movq $18446744071562067968, (%rdi) ; trap: heap_oob +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; retq + +function %store_max_i64_imm64(i64 sret) { +block0(v0: i64): + v1 = iconst.i64 0x7fff_ffff_ffff_ffff + store v1, v0 + return +} + +; VCode: +; pushq %rbp +; movq %rsp, %rbp +; block0: +; movabsq $9223372036854775807, %rax +; movq %rax, 0(%rdi) +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; ret +; +; Disassembled: +; block0: ; offset 0x0 +; pushq %rbp +; movq %rsp, %rbp +; block1: ; offset 0x4 +; movabsq $0x7fffffffffffffff, %rax +; movq %rax, (%rdi) ; trap: heap_oob +; movq %rdi, %rax +; movq %rbp, %rsp +; popq %rbp +; retq + diff --git a/cranelift/filetests/filetests/isa/x64/struct-ret.clif b/cranelift/filetests/filetests/isa/x64/struct-ret.clif index ab4684d92665..3732beee3741 100644 --- a/cranelift/filetests/filetests/isa/x64/struct-ret.clif +++ b/cranelift/filetests/filetests/isa/x64/struct-ret.clif @@ -12,8 +12,7 @@ block0(v0: i64): ; pushq %rbp ; movq %rsp, 
%rbp ; block0: -; movl $42, %eax -; movq %rax, 0(%rdi) +; movq $42, 0(%rdi) ; movq %rdi, %rax ; movq %rbp, %rsp ; popq %rbp @@ -24,8 +23,7 @@ block0(v0: i64): ; pushq %rbp ; movq %rsp, %rbp ; block1: ; offset 0x4 -; movl $0x2a, %eax -; movq %rax, (%rdi) ; trap: heap_oob +; movq $0x2a, (%rdi) ; trap: heap_oob ; movq %rdi, %rax ; movq %rbp, %rsp ; popq %rbp diff --git a/winch/codegen/src/isa/x64/asm.rs b/winch/codegen/src/isa/x64/asm.rs index 9c9a300f9ee7..f5c025cede0e 100644 --- a/winch/codegen/src/isa/x64/asm.rs +++ b/winch/codegen/src/isa/x64/asm.rs @@ -251,13 +251,13 @@ impl Assembler { } /// Immediate-to-memory move. - pub fn mov_im(&mut self, src: u64, addr: &Address, size: OperandSize) { + pub fn mov_im(&mut self, src: i32, addr: &Address, size: OperandSize) { assert!(addr.is_offset()); let dst = Self::to_synthetic_amode(addr, &mut self.pool, &mut self.constants, &mut self.buffer); self.emit(Inst::MovImmM { size: size.into(), - simm64: src, + simm32: src, dst, }); } diff --git a/winch/codegen/src/isa/x64/masm.rs b/winch/codegen/src/isa/x64/masm.rs index af39b793ed8c..ee83b6920cbd 100644 --- a/winch/codegen/src/isa/x64/masm.rs +++ b/winch/codegen/src/isa/x64/masm.rs @@ -114,8 +114,13 @@ impl Masm for MacroAssembler { fn store(&mut self, src: RegImm, dst: Address, size: OperandSize) { match src { RegImm::Imm(imm) => match imm { - I::I32(v) => self.asm.mov_im(v as u64, &dst, size), - I::I64(v) => self.asm.mov_im(v, &dst, size), + I::I32(v) => self.asm.mov_im(v as i32, &dst, size), + I::I64(v) => match v.try_into() { + Ok(v) => self.asm.mov_im(v, &dst, size), + Err(_) => { + panic!("Immediate-to-memory moves require immediate operand to sign-extend to 64 bits."); + } + }, // Immediate to memory moves are currently only used // to zero a memory range, which only involves // ints. See [`MacroAssembler::zero_mem_range`]. 
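The constraint behind the `simm32` change can be sketched in isolation. On x64, a 64-bit immediate-to-memory store (`REX.W C7 /0 id`) carries only a 32-bit immediate, which the CPU sign-extends to 64 bits, so an `i64` constant can be stored directly only if it round-trips through `i32`. The sketch below is an illustration under that assumption, not code from this patch: `imm32_for_store` and `sign_extend` are hypothetical names, and the sample values are taken from `store-imm.clif`.

```rust
/// Hypothetical helper (not part of Cranelift or Winch): returns the 32-bit
/// immediate for `value` if it fits in an `i32`, which is the case the new
/// lowering rule and the `try_into` in Winch's `store` accept.
fn imm32_for_store(value: i64) -> Option<i32> {
    i32::try_from(value).ok()
}

/// Models what the CPU does with the imm32 of a 64-bit store:
/// sign-extend it to 64 bits before writing to memory.
fn sign_extend(imm: i32) -> i64 {
    i64::from(imm)
}

fn main() {
    // Constants used by the store-imm.clif tests.
    let samples = [
        0x1234_5678_i64,
        0x7fff_ffff,
        -2_147_483_648,
        0x7fff_ffff_ffff_ffff,
    ];

    for value in samples {
        match imm32_for_store(value) {
            Some(imm) => {
                // The round trip must reproduce the original constant.
                assert_eq!(sign_extend(imm), value);
                println!("{value:#x}: store directly with imm32 {imm}");
            }
            // Out of range: the constant has to go through a register first.
            None => println!("{value:#x}: materialize in a register, then store"),
        }
    }
}
```

This matches the filetest expectations above: the first three constants lower to a single `movq $imm, 0(%rdi)`, while `0x7fff_ffff_ffff_ffff` still goes through `movabsq` and a register store.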
From 4dbc1f6a2fdc156e8af3f400ea1be541933b54ce Mon Sep 17 00:00:00 2001 From: Afonso Bordado Date: Mon, 11 Sep 2023 15:36:47 +0100 Subject: [PATCH 06/14] riscv64: Remove support for fixed offset jumps from Jump instructions (#6988) * riscv64: Use `MachLabel` for Jal * riscv64: Use `Inst::gen_jump` for jumps This is mostly a personal preference. It emits the exact same code. * riscv64: Use VecMachLabel on BrTable * riscv64: Remove `BranchTarget::Offset` arm Replaces it with `Fallthrough` which works the same way with a fixed offset of 0. * riscv64: Rename `BranchTarget` It is now only used for `CondBr` * riscv64: Panic on fallthrough taken target in condbr --- cranelift/codegen/src/isa/riscv64/inst.isle | 52 ++-- .../codegen/src/isa/riscv64/inst/emit.rs | 284 ++++++------------ cranelift/codegen/src/isa/riscv64/inst/mod.rs | 60 ++-- .../codegen/src/isa/riscv64/lower/isle.rs | 19 +- 4 files changed, 145 insertions(+), 270 deletions(-) diff --git a/cranelift/codegen/src/isa/riscv64/inst.isle b/cranelift/codegen/src/isa/riscv64/inst.isle index 3d919104af9e..6a3934b591e9 100644 --- a/cranelift/codegen/src/isa/riscv64/inst.isle +++ b/cranelift/codegen/src/isa/riscv64/inst.isle @@ -132,11 +132,11 @@ (Jal ;; (rd WritableReg) don't use - (dest BranchTarget)) + (label MachLabel)) (CondBr - (taken BranchTarget) - (not_taken BranchTarget) + (taken CondBrTarget) + (not_taken CondBrTarget) (kind IntegerCompare)) ;; Load an inline symbol reference. @@ -227,7 +227,7 @@ (index Reg) (tmp1 WritableReg) (tmp2 WritableReg) - (targets VecBranchTarget)) + (targets VecMachLabel)) ;; atomic compare and set operation (AtomicCas @@ -763,7 +763,6 @@ ;;;; lowest four bit are used. 
(type FenceReq (primitive u8)) -(type VecBranchTarget (primitive VecBranchTarget)) (type BoxCallInfo (primitive BoxCallInfo)) (type BoxCallIndInfo (primitive BoxCallIndInfo)) (type BoxReturnCallInfo (primitive BoxReturnCallInfo)) @@ -777,7 +776,7 @@ (type Imm5 (primitive Imm5)) (type Imm20 (primitive Imm20)) (type Imm3 (primitive Imm3)) -(type BranchTarget (primitive BranchTarget)) +(type CondBrTarget (primitive CondBrTarget)) (type OptionFloatRoundingMode (primitive OptionFloatRoundingMode)) (type VecU8 (primitive VecU8)) (type AMO (primitive AMO)) @@ -2660,21 +2659,6 @@ (test XReg (rv_srli sum (imm12_const (ty_bits ty))))) (value_regs sum test))) -(decl label_to_br_target (MachLabel) BranchTarget) -(extern constructor label_to_br_target label_to_br_target) - -(decl gen_jump (MachLabel) MInst) -(rule - (gen_jump v) - (MInst.Jal (label_to_br_target v))) - -(decl vec_label_get (VecMachLabel u8) MachLabel ) -(extern constructor vec_label_get vec_label_get) - -(decl partial lower_branch (Inst VecMachLabel) Unit) -(rule (lower_branch (jump _) targets ) - (emit_side_effect (SideEffectNoResult.Inst (gen_jump (vec_label_get targets 0))))) - ;;; cc a b targets Type (decl lower_br_icmp (IntCC ValueRegs ValueRegs VecMachLabel Type) Unit) (extern constructor lower_br_icmp lower_br_icmp) @@ -2718,6 +2702,17 @@ (hi XReg (value_regs_get regs 1))) (rv_or lo hi))) + +(decl label_to_br_target (MachLabel) CondBrTarget) +(extern constructor label_to_br_target label_to_br_target) + +(decl vec_label_get (VecMachLabel u8) MachLabel) +(extern constructor vec_label_get vec_label_get) + +(decl partial lower_branch (Inst VecMachLabel) Unit) +(rule (lower_branch (jump _) targets ) + (emit_side_effect (SideEffectNoResult.Inst (MInst.Jal (vec_label_get targets 0))))) + ;; Default behavior for branching based on an input value. 
(rule (lower_branch (brif v @ (value_type ty) _ _) targets) @@ -2739,23 +2734,22 @@ (rule 1 (lower_branch (brif (maybe_uextend (fcmp cc a @ (value_type ty) b)) _ _) targets) (if-let $true (floatcc_unordered cc)) - (let ((then BranchTarget (label_to_br_target (vec_label_get targets 0))) - (else BranchTarget (label_to_br_target (vec_label_get targets 1)))) + (let ((then CondBrTarget (label_to_br_target (vec_label_get targets 0))) + (else CondBrTarget (label_to_br_target (vec_label_get targets 1)))) (emit_side_effect (cond_br (emit_fcmp (floatcc_complement cc) ty a b) else then)))) (rule 1 (lower_branch (brif (maybe_uextend (fcmp cc a @ (value_type ty) b)) _ _) targets) (if-let $false (floatcc_unordered cc)) - (let ((then BranchTarget (label_to_br_target (vec_label_get targets 0))) - (else BranchTarget (label_to_br_target (vec_label_get targets 1)))) + (let ((then CondBrTarget (label_to_br_target (vec_label_get targets 0))) + (else CondBrTarget (label_to_br_target (vec_label_get targets 1)))) (emit_side_effect (cond_br (emit_fcmp cc ty a b) then else)))) -;;; + (decl lower_br_table (Reg VecMachLabel) Unit) (extern constructor lower_br_table lower_br_table) -(rule - (lower_branch (br_table index _) targets) +(rule (lower_branch (br_table index _) targets) (lower_br_table index targets)) (decl load_ra () Reg) @@ -2983,7 +2977,7 @@ (rule (cmp_result_invert result) (CmpResult.Result result $true)) ;; Consume a CmpResult, producing a branch on its result. 
-(decl cond_br (CmpResult BranchTarget BranchTarget) SideEffectNoResult) +(decl cond_br (CmpResult CondBrTarget CondBrTarget) SideEffectNoResult) (rule (cond_br cmp then else) (SideEffectNoResult.Inst (MInst.CondBr then else (cmp_integer_compare cmp)))) diff --git a/cranelift/codegen/src/isa/riscv64/inst/emit.rs b/cranelift/codegen/src/isa/riscv64/inst/emit.rs index 049c4bf49dbc..fdab6b32ff3f 100644 --- a/cranelift/codegen/src/isa/riscv64/inst/emit.rs +++ b/cranelift/codegen/src/isa/riscv64/inst/emit.rs @@ -174,8 +174,8 @@ impl Inst { tmp: Writable, rs: Reg, ty: Type, - taken: BranchTarget, - not_taken: BranchTarget, + taken: CondBrTarget, + not_taken: CondBrTarget, ) -> SmallInstVec { let mut insts = SmallInstVec::new(); let class_op = if ty == F32 { @@ -211,8 +211,8 @@ impl Inst { cc: IntCC, a: ValueRegs, b: ValueRegs, - taken: BranchTarget, - not_taken: BranchTarget, + taken: CondBrTarget, + not_taken: CondBrTarget, ty: Type, ) -> SmallInstVec { let mut insts = SmallInstVec::new(); @@ -248,7 +248,7 @@ impl Inst { // then we can go to not_taken otherwise fallthrough. insts.push(Inst::CondBr { taken: not_taken, - not_taken: BranchTarget::zero(), + not_taken: CondBrTarget::Fallthrough, kind: high(IntCC::NotEqual), }); // the rest part. @@ -265,7 +265,7 @@ impl Inst { // we can goto the taken part , otherwise fallthrought. 
insts.push(Inst::CondBr { taken, - not_taken: BranchTarget::zero(), // no branch + not_taken: CondBrTarget::Fallthrough, // no branch kind: high(IntCC::NotEqual), }); @@ -286,13 +286,13 @@ impl Inst { // insts.push(Inst::CondBr { taken, - not_taken: BranchTarget::zero(), + not_taken: CondBrTarget::Fallthrough, kind: high(cc.without_equal()), }); // insts.push(Inst::CondBr { taken: not_taken, - not_taken: BranchTarget::zero(), + not_taken: CondBrTarget::Fallthrough, kind: high(IntCC::NotEqual), }); insts.push(Inst::CondBr { @@ -465,10 +465,7 @@ impl MachInstEmit for Inst { .emit(&[], sink, emit_info, state); // Jump over the inline pool - Inst::Jal { - dest: BranchTarget::Label(label_end), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_end).emit(&[], sink, emit_info, state); // Emit the inline data sink.bind_label(label_data, &mut state.ctrl_plane); @@ -898,35 +895,10 @@ impl MachInstEmit for Inst { start_off = sink.cur_offset(); } - &Inst::Jal { dest } => { - let code: u32 = 0b1101111; - match dest { - BranchTarget::Label(lable) => { - sink.use_label_at_offset(start_off, lable, LabelUse::Jal20); - sink.add_uncond_branch(start_off, start_off + 4, lable); - sink.put4(code); - } - BranchTarget::ResolvedOffset(offset) => { - let offset = offset as i64; - if offset != 0 { - if LabelUse::Jal20.offset_in_range(offset) { - let mut code = code.to_le_bytes(); - LabelUse::Jal20.patch_raw_offset(&mut code, offset); - sink.put_data(&code[..]); - } else { - Inst::construct_auipc_and_jalr( - None, - writable_spilltmp_reg(), - offset, - ) - .into_iter() - .for_each(|i| i.emit(&[], sink, emit_info, state)); - } - } else { - // CondBr often generate Jal {dest : 0}, means otherwise no jump. 
- } - } - } + &Inst::Jal { label } => { + sink.use_label_at_offset(start_off, label, LabelUse::Jal20); + sink.add_uncond_branch(start_off, start_off + 4, label); + sink.put4(0b1101111); } &Inst::CondBr { taken, @@ -936,36 +908,22 @@ impl MachInstEmit for Inst { kind.rs1 = allocs.next(kind.rs1); kind.rs2 = allocs.next(kind.rs2); match taken { - BranchTarget::Label(label) => { + CondBrTarget::Label(label) => { let code = kind.emit(); let code_inverse = kind.inverse().emit().to_le_bytes(); sink.use_label_at_offset(start_off, label, LabelUse::B12); sink.add_cond_branch(start_off, start_off + 4, label, &code_inverse); sink.put4(code); } - BranchTarget::ResolvedOffset(offset) => { - assert!(offset != 0); - if LabelUse::B12.offset_in_range(offset as i64) { - let code = kind.emit(); - let mut code = code.to_le_bytes(); - LabelUse::B12.patch_raw_offset(&mut code, offset as i64); - sink.put_data(&code[..]) - } else { - let mut code = kind.emit().to_le_bytes(); - // jump over the condbr , 4 bytes. 
- LabelUse::B12.patch_raw_offset(&mut code[..], 4); - sink.put_data(&code[..]); - Inst::construct_auipc_and_jalr( - None, - writable_spilltmp_reg(), - offset as i64, - ) - .into_iter() - .for_each(|i| i.emit(&[], sink, emit_info, state)); - } - } + CondBrTarget::Fallthrough => panic!("Cannot fallthrough in taken target"), } - Inst::Jal { dest: not_taken }.emit(&[], sink, emit_info, state); + + match not_taken { + CondBrTarget::Label(label) => { + Inst::gen_jump(label).emit(&[], sink, emit_info, state) + } + CondBrTarget::Fallthrough => {} + }; } &Inst::Mov { rd, rm, ty } => { @@ -1081,8 +1039,8 @@ impl MachInstEmit for Inst { .iter() .for_each(|i| i.emit(&[], sink, emit_info, state)); Inst::CondBr { - taken: BranchTarget::Label(label_compute_target), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_compute_target), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::UnsignedLessThan, rs1: ext_index.to_reg(), @@ -1090,11 +1048,8 @@ impl MachInstEmit for Inst { }, } .emit(&[], sink, emit_info, state); - sink.use_label_at_offset( - sink.cur_offset(), - default_target.as_label().unwrap(), - LabelUse::PCRel32, - ); + + sink.use_label_at_offset(sink.cur_offset(), default_target, LabelUse::PCRel32); Inst::construct_auipc_and_jalr(None, tmp2, 0) .iter() .for_each(|i| i.emit(&[], sink, emit_info, state)); @@ -1154,11 +1109,7 @@ impl MachInstEmit for Inst { // Emit the jumps back to back for target in targets.iter() { - sink.use_label_at_offset( - sink.cur_offset(), - target.as_label().unwrap(), - LabelUse::PCRel32, - ); + sink.use_label_at_offset(sink.cur_offset(), *target, LabelUse::PCRel32); Inst::construct_auipc_and_jalr(None, tmp2, 0) .iter() @@ -1301,8 +1252,8 @@ impl MachInstEmit for Inst { let mut insts = SmallInstVec::new(); let label_false = sink.get_label(); insts.push(Inst::CondBr { - taken: BranchTarget::Label(label_false), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_false), + not_taken: 
CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::Equal, rs1: condition, @@ -1313,9 +1264,7 @@ impl MachInstEmit for Inst { // select the first value insts.extend(gen_moves(&dst[..], x.regs())); let label_jump_over = sink.get_label(); - insts.push(Inst::Jal { - dest: BranchTarget::Label(label_jump_over), - }); + insts.push(Inst::gen_jump(label_jump_over)); // here is false insts .drain(..) @@ -1354,8 +1303,8 @@ impl MachInstEmit for Inst { cc, a, b, - BranchTarget::Label(label_true), - BranchTarget::Label(label_false), + CondBrTarget::Label(label_true), + CondBrTarget::Label(label_false), ty, ) .into_iter() @@ -1363,10 +1312,7 @@ impl MachInstEmit for Inst { sink.bind_label(label_true, &mut state.ctrl_plane); Inst::load_imm12(rd, Imm12::TRUE).emit(&[], sink, emit_info, state); - Inst::Jal { - dest: BranchTarget::Label(label_end), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_end).emit(&[], sink, emit_info, state); sink.bind_label(label_false, &mut state.ctrl_plane); Inst::load_imm12(rd, Imm12::FALSE).emit(&[], sink, emit_info, state); sink.bind_label(label_end, &mut state.ctrl_plane); @@ -1423,8 +1369,8 @@ impl MachInstEmit for Inst { .emit(&[], sink, emit_info, state); } Inst::CondBr { - taken: BranchTarget::Label(fail_label), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(fail_label), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::NotEqual, rs1: e, @@ -1460,8 +1406,8 @@ impl MachInstEmit for Inst { .emit(&[], sink, emit_info, state); // check is our value stored. 
Inst::CondBr { - taken: BranchTarget::Label(cas_lebel), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(cas_lebel), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::NotEqual, rs1: t0.to_reg(), @@ -1602,18 +1548,15 @@ impl MachInstEmit for Inst { }, ValueRegs::one(dst.to_reg()), ValueRegs::one(x), - BranchTarget::Label(label_select_dst), - BranchTarget::zero(), + CondBrTarget::Label(label_select_dst), + CondBrTarget::Fallthrough, ty, ) .iter() .for_each(|i| i.emit(&[], sink, emit_info, state)); // here we select x. Inst::gen_move(t0, x, I64).emit(&[], sink, emit_info, state); - Inst::Jal { - dest: BranchTarget::Label(label_select_done), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_select_done).emit(&[], sink, emit_info, state); sink.bind_label(label_select_dst, &mut state.ctrl_plane); Inst::gen_move(t0, dst.to_reg(), I64).emit(&[], sink, emit_info, state); sink.bind_label(label_select_done, &mut state.ctrl_plane); @@ -1672,8 +1615,8 @@ impl MachInstEmit for Inst { // if store is not ok,retry. 
Inst::CondBr { - taken: BranchTarget::Label(retry), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(retry), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::NotEqual, rs1: t0.to_reg(), @@ -1700,8 +1643,8 @@ impl MachInstEmit for Inst { op.to_int_cc(), x, y, - BranchTarget::Label(label_true), - BranchTarget::Label(label_false), + CondBrTarget::Label(label_true), + CondBrTarget::Label(label_false), ty, ) .into_iter() @@ -1760,10 +1703,7 @@ impl MachInstEmit for Inst { // here is false , use rs2 Inst::gen_move(rd, rs2, ty).emit(&[], sink, emit_info, state); // and jump over - Inst::Jal { - dest: BranchTarget::Label(label_jump_over), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_jump_over).emit(&[], sink, emit_info, state); // here condition is true , use rs1 sink.bind_label(label_true, &mut state.ctrl_plane); Inst::gen_move(rd, rs1, ty).emit(&[], sink, emit_info, state); @@ -1787,8 +1727,8 @@ impl MachInstEmit for Inst { Inst::emit_not_nan(rd, rs, in_type).emit(&[], sink, emit_info, state); // jump to nan. Inst::CondBr { - taken: BranchTarget::Label(label_nan), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_nan), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::Equal, rs2: zero_reg(), @@ -1921,10 +1861,7 @@ impl MachInstEmit for Inst { } // I already have the result,jump over. 
- Inst::Jal { - dest: BranchTarget::Label(label_jump_over), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_jump_over).emit(&[], sink, emit_info, state); // here is nan , move 0 into rd register sink.bind_label(label_nan, &mut state.ctrl_plane); if is_sat { @@ -1960,10 +1897,7 @@ impl MachInstEmit for Inst { .emit(&[], sink, emit_info, state); // Jump over the data - Inst::Jal { - dest: BranchTarget::Label(label_end), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_end).emit(&[], sink, emit_info, state); sink.bind_label(label_data, &mut state.ctrl_plane); sink.add_reloc(Reloc::Abs8, name.as_ref(), offset); @@ -1982,8 +1916,8 @@ impl MachInstEmit for Inst { let label_trap = sink.get_label(); let label_jump_over = sink.get_label(); Inst::CondBr { - taken: BranchTarget::Label(label_trap), - not_taken: BranchTarget::Label(label_jump_over), + taken: CondBrTarget::Label(label_trap), + not_taken: CondBrTarget::Label(label_jump_over), kind: IntegerCompare { kind: cc, rs1, rs2 }, } .emit(&[], sink, emit_info, state); @@ -1997,8 +1931,8 @@ impl MachInstEmit for Inst { let label_trap = sink.get_label(); let label_jump_over = sink.get_label(); Inst::CondBr { - taken: BranchTarget::Label(label_trap), - not_taken: BranchTarget::Label(label_jump_over), + taken: CondBrTarget::Label(label_trap), + not_taken: CondBrTarget::Label(label_jump_over), kind: IntegerCompare { kind: IntCC::NotEqual, rs1: test, @@ -2079,8 +2013,8 @@ impl MachInstEmit for Inst { // check if is nan. 
Inst::emit_not_nan(int_tmp, rs, ty).emit(&[], sink, emit_info, state); Inst::CondBr { - taken: BranchTarget::Label(label_nan), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_nan), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::Equal, rs1: int_tmp.to_reg(), @@ -2135,8 +2069,8 @@ impl MachInstEmit for Inst { .emit(&[], sink, emit_info, state); Inst::CondBr { - taken: BranchTarget::Label(label_x), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_x), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::NotEqual, rs1: int_tmp.to_reg(), @@ -2175,10 +2109,7 @@ impl MachInstEmit for Inst { } .emit(&[], sink, emit_info, state); // jump over. - Inst::Jal { - dest: BranchTarget::Label(label_jump_over), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_jump_over).emit(&[], sink, emit_info, state); // here is nan. sink.bind_label(label_nan, &mut state.ctrl_plane); Inst::FpuRRR { @@ -2193,10 +2124,7 @@ impl MachInstEmit for Inst { rs2: rs, } .emit(&[], sink, emit_info, state); - Inst::Jal { - dest: BranchTarget::Label(label_jump_over), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_jump_over).emit(&[], sink, emit_info, state); // here select origin x. sink.bind_label(label_x, &mut state.ctrl_plane); Inst::gen_move(rd, rs, ty).emit(&[], sink, emit_info, state); @@ -2220,8 +2148,8 @@ impl MachInstEmit for Inst { // check if rs1 is nan. Inst::emit_not_nan(tmp, rs1, ty).emit(&[], sink, emit_info, state); Inst::CondBr { - taken: BranchTarget::Label(label_nan), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_nan), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::Equal, rs1: tmp.to_reg(), @@ -2232,8 +2160,8 @@ impl MachInstEmit for Inst { // check if rs2 is nan. 
Inst::emit_not_nan(tmp, rs2, ty).emit(&[], sink, emit_info, state); Inst::CondBr { - taken: BranchTarget::Label(label_nan), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_nan), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::Equal, rs1: tmp.to_reg(), @@ -2260,15 +2188,15 @@ impl MachInstEmit for Inst { tmp, rs1, ty, - BranchTarget::Label(label_done), - BranchTarget::zero(), + CondBrTarget::Label(label_done), + CondBrTarget::Fallthrough, ); insts.extend(Inst::emit_if_float_not_zero( tmp, rs2, ty, - BranchTarget::Label(label_done), - BranchTarget::zero(), + CondBrTarget::Label(label_done), + CondBrTarget::Fallthrough, )); insts .iter() @@ -2311,10 +2239,7 @@ impl MachInstEmit for Inst { sink.bind_label(label_done, &mut state.ctrl_plane); } // we have the reuslt,jump over. - Inst::Jal { - dest: BranchTarget::Label(label_jump_over), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_jump_over).emit(&[], sink, emit_info, state); // here is nan. 
sink.bind_label(label_nan, &mut state.ctrl_plane); op.snan_bits(tmp, ty) @@ -2363,8 +2288,8 @@ impl MachInstEmit for Inst { let label_loop = sink.get_label(); sink.bind_label(label_loop, &mut state.ctrl_plane); Inst::CondBr { - taken: BranchTarget::Label(label_done), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_done), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::SignedLessThanOrEqual, rs1: step.to_reg(), @@ -2383,8 +2308,8 @@ impl MachInstEmit for Inst { .emit(&[], sink, emit_info, state); let label_over = sink.get_label(); Inst::CondBr { - taken: BranchTarget::Label(label_over), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_over), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::Equal, rs1: zero_reg(), @@ -2417,10 +2342,7 @@ impl MachInstEmit for Inst { imm12: Imm12::from_bits(1), } .emit(&[], sink, emit_info, state); - Inst::Jal { - dest: BranchTarget::Label(label_loop), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_loop).emit(&[], sink, emit_info, state); } sink.bind_label(label_done, &mut state.ctrl_plane); } @@ -2438,8 +2360,8 @@ impl MachInstEmit for Inst { let label_loop = sink.get_label(); sink.bind_label(label_loop, &mut state.ctrl_plane); Inst::CondBr { - taken: BranchTarget::Label(label_done), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_done), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::SignedLessThan, rs1: step.to_reg(), @@ -2469,6 +2391,7 @@ impl MachInstEmit for Inst { rs2: spilltmp_reg(), } .emit(&[], sink, emit_info, state); + { // reset step Inst::AluRRImm12 { @@ -2487,11 +2410,9 @@ impl MachInstEmit for Inst { } .emit(&[], sink, emit_info, state); // loop. 
- Inst::Jal { - dest: BranchTarget::Label(label_loop), - } + Inst::gen_jump(label_loop).emit(&[], sink, emit_info, state); } - .emit(&[], sink, emit_info, state); + sink.bind_label(label_done, &mut state.ctrl_plane); } &Inst::Cltz { @@ -2530,8 +2451,8 @@ impl MachInstEmit for Inst { let label_loop = sink.get_label(); sink.bind_label(label_loop, &mut state.ctrl_plane); Inst::CondBr { - taken: BranchTarget::Label(label_done), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_done), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::SignedLessThanOrEqual, rs1: step.to_reg(), @@ -2549,8 +2470,8 @@ impl MachInstEmit for Inst { } .emit(&[], sink, emit_info, state); Inst::CondBr { - taken: BranchTarget::Label(label_done), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_done), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::NotEqual, rs1: zero_reg(), @@ -2586,10 +2507,7 @@ impl MachInstEmit for Inst { imm12: Imm12::from_bits(1), } .emit(&[], sink, emit_info, state); - Inst::Jal { - dest: BranchTarget::Label(label_loop), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_loop).emit(&[], sink, emit_info, state); } sink.bind_label(label_done, &mut state.ctrl_plane); } @@ -2635,8 +2553,8 @@ impl MachInstEmit for Inst { let label_loop = sink.get_label(); sink.bind_label(label_loop, &mut state.ctrl_plane); Inst::CondBr { - taken: BranchTarget::Label(label_done), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_done), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::SignedLessThanOrEqual, rs1: step.to_reg(), @@ -2655,8 +2573,8 @@ impl MachInstEmit for Inst { .emit(&[], sink, emit_info, state); let label_over = sink.get_label(); Inst::CondBr { - taken: BranchTarget::Label(label_over), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_over), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { 
kind: IntCC::Equal, rs1: zero_reg(), @@ -2709,8 +2627,8 @@ impl MachInstEmit for Inst { } .emit(&[], sink, emit_info, state); Inst::CondBr { - taken: BranchTarget::Label(label_sll_1), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_sll_1), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::NotEqual, rs1: spilltmp_reg2(), @@ -2725,10 +2643,7 @@ impl MachInstEmit for Inst { imm12: Imm12::from_bits(15), } .emit(&[], sink, emit_info, state); - Inst::Jal { - dest: BranchTarget::Label(label_over), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_over).emit(&[], sink, emit_info, state); sink.bind_label(label_sll_1, &mut state.ctrl_plane); Inst::AluRRImm12 { alu_op: AluOPRRI::Slli, @@ -2739,10 +2654,7 @@ impl MachInstEmit for Inst { .emit(&[], sink, emit_info, state); sink.bind_label(label_over, &mut state.ctrl_plane); } - Inst::Jal { - dest: BranchTarget::Label(label_loop), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(label_loop).emit(&[], sink, emit_info, state); } sink.bind_label(label_done, &mut state.ctrl_plane); } @@ -2763,8 +2675,8 @@ impl MachInstEmit for Inst { let label_done = sink.get_label(); sink.bind_label(loop_start, &mut state.ctrl_plane); Inst::CondBr { - taken: BranchTarget::Label(label_done), - not_taken: BranchTarget::zero(), + taken: CondBrTarget::Label(label_done), + not_taken: CondBrTarget::Fallthrough, kind: IntegerCompare { kind: IntCC::UnsignedLessThanOrEqual, rs1: step.to_reg(), @@ -2795,10 +2707,7 @@ impl MachInstEmit for Inst { rs2: guard_size_tmp.to_reg(), } .emit(&[], sink, emit_info, state); - Inst::Jal { - dest: BranchTarget::Label(loop_start), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(loop_start).emit(&[], sink, emit_info, state); sink.bind_label(label_done, &mut state.ctrl_plane); } &Inst::VecAluRRRImm5 { @@ -3057,10 +2966,7 @@ fn emit_return_call_common_sequence( let space_needed = insts * 
u32::try_from(Inst::UNCOMPRESSED_INSTRUCTION_SIZE).unwrap(); if sink.island_needed(space_needed) { let jump_around_label = sink.get_label(); - Inst::Jal { - dest: BranchTarget::Label(jump_around_label), - } - .emit(&[], sink, emit_info, state); + Inst::gen_jump(jump_around_label).emit(&[], sink, emit_info, state); sink.emit_island(space_needed + 4, &mut state.ctrl_plane); sink.bind_label(jump_around_label, &mut state.ctrl_plane); } diff --git a/cranelift/codegen/src/isa/riscv64/inst/mod.rs b/cranelift/codegen/src/isa/riscv64/inst/mod.rs index ec56b568d507..f5d4e2b53e5b 100644 --- a/cranelift/codegen/src/isa/riscv64/inst/mod.rs +++ b/cranelift/codegen/src/isa/riscv64/inst/mod.rs @@ -46,7 +46,6 @@ use std::fmt::{Display, Formatter}; pub(crate) type OptionReg = Option<Reg>; pub(crate) type OptionImm12 = Option<Imm12>; -pub(crate) type VecBranchTarget = Vec<BranchTarget>; pub(crate) type OptionUimm5 = Option; pub(crate) type OptionFloatRoundingMode = Option; pub(crate) type VecU8 = Vec<u8>; @@ -103,53 +102,35 @@ pub struct ReturnCallInfo { pub new_stack_arg_size: u32, } -/// A branch target. Either unresolved (basic-block index) or resolved (offset -/// from end of current instruction). +/// A conditional branch target. #[derive(Clone, Copy, Debug, PartialEq, Eq)] -pub enum BranchTarget { +pub enum CondBrTarget { /// An unresolved reference to a Label, as passed into /// `lower_branch_group()`. Label(MachLabel), - /// A fixed PC offset. - ResolvedOffset(i32), + /// No jump; fall through to the next instruction. + Fallthrough, } -impl BranchTarget { +impl CondBrTarget { /// Return the target's label, if it is a label-based target. pub(crate) fn as_label(self) -> Option<MachLabel> { match self { - BranchTarget::Label(l) => Some(l), + CondBrTarget::Label(l) => Some(l), _ => None, } } - /// offset zero. 
- #[inline] - pub(crate) fn zero() -> Self { - Self::ResolvedOffset(0) - } - - #[inline] - pub(crate) fn is_zero(self) -> bool { - match self { - BranchTarget::Label(_) => false, - BranchTarget::ResolvedOffset(off) => off == 0, - } - } - #[inline] - pub(crate) fn as_offset(self) -> Option<i32> { - match self { - BranchTarget::Label(_) => None, - BranchTarget::ResolvedOffset(off) => Some(off), - } + pub(crate) fn is_fallthrouh(&self) -> bool { + self == &CondBrTarget::Fallthrough } } -impl Display for BranchTarget { +impl Display for CondBrTarget { fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { match self { - BranchTarget::Label(l) => write!(f, "{}", l.to_string()), - BranchTarget::ResolvedOffset(off) => write!(f, "{}", off), + CondBrTarget::Label(l) => write!(f, "{}", l.to_string()), + CondBrTarget::Fallthrough => write!(f, "0"), } } } @@ -480,7 +461,10 @@ fn riscv64_get_operands<F: Fn(VReg) -> VReg>(inst: &Inst, collector: &mut Operan &Inst::TrapIf { test, .. } => { collector.reg_use(test); } - &Inst::Jal { .. } => {} + &Inst::Jal { .. } => { + // JAL technically has a rd register, but we currently always + // hardcode it to x0. + } &Inst::CondBr { kind, .. } => { collector.reg_use(kind.rs1); collector.reg_use(kind.rs2); @@ -967,9 +951,7 @@ impl MachInst for Inst { } fn gen_jump(target: MachLabel) -> Inst { - Inst::Jal { - dest: BranchTarget::Label(target), - } + Inst::Jal { label: target } } fn worst_case_size() -> CodeOffset { @@ -1360,7 +1342,6 @@ impl Inst { tmp2, ref targets, } => { - let targets: Vec<_> = targets.iter().map(|x| x.as_label().unwrap()).collect(); format!( "{} {},{}##tmp1={},tmp2={}", "br_table", @@ -1661,8 +1642,8 @@ impl Inst { let rs2 = format_reg(rs2, allocs); format!("trap_ifc {}##({} {} {})", trap_code, rs1, cc, rs2) } - &MInst::Jal { dest, .. 
} => { - format!("{} {}", "j", dest) + &MInst::Jal { label } => { + format!("j {}", label.to_string()) } &MInst::CondBr { taken, @@ -1672,9 +1653,8 @@ impl Inst { } => { let rs1 = format_reg(kind.rs1, allocs); let rs2 = format_reg(kind.rs2, allocs); - if not_taken.is_zero() && taken.as_label().is_none() { - let off = taken.as_offset().unwrap(); - format!("{} {},{},{}", kind.op_name(), rs1, rs2, off) + if not_taken.is_fallthrouh() && taken.as_label().is_none() { + format!("{} {},{},0", kind.op_name(), rs1, rs2) } else { let x = format!( "{} {},{},taken({}),not_taken({})", diff --git a/cranelift/codegen/src/isa/riscv64/lower/isle.rs b/cranelift/codegen/src/isa/riscv64/lower/isle.rs index 81ecba07d7c4..eb72f3413b81 100644 --- a/cranelift/codegen/src/isa/riscv64/lower/isle.rs +++ b/cranelift/codegen/src/isa/riscv64/lower/isle.rs @@ -201,8 +201,8 @@ impl generated_code::Context for RV64IsleContext<'_, '_, MInst, Riscv64Backend> *cc, a, self.int_zero_reg(ty), - BranchTarget::Label(targets[0]), - BranchTarget::Label(targets[1]), + CondBrTarget::Label(targets[0]), + CondBrTarget::Label(targets[1]), ty, ) .iter() @@ -218,8 +218,8 @@ impl generated_code::Context for RV64IsleContext<'_, '_, MInst, Riscv64Backend> ) -> Unit { let test = generated_code::constructor_lower_icmp(self, cc, a, b, ty); self.emit(&MInst::CondBr { - taken: BranchTarget::Label(targets[0]), - not_taken: BranchTarget::Label(targets[1]), + taken: CondBrTarget::Label(targets[0]), + not_taken: CondBrTarget::Label(targets[1]), kind: IntegerCompare { kind: IntCC::NotEqual, rs1: test, @@ -254,8 +254,8 @@ impl generated_code::Context for RV64IsleContext<'_, '_, MInst, Riscv64Backend> val[x as usize] } - fn label_to_br_target(&mut self, label: MachLabel) -> BranchTarget { - BranchTarget::Label(label) + fn label_to_br_target(&mut self, label: MachLabel) -> CondBrTarget { + CondBrTarget::Label(label) } fn vec_writable_clone(&mut self, v: &VecWritableReg) -> VecWritableReg { @@ -514,16 +514,11 @@ impl 
generated_code::Context for RV64IsleContext<'_, '_, MInst, Riscv64Backend> fn lower_br_table(&mut self, index: Reg, targets: &VecMachLabel) -> Unit { let tmp1 = self.temp_writable_reg(I64); let tmp2 = self.temp_writable_reg(I64); - let targets: Vec<BranchTarget> = targets - .into_iter() - .copied() - .map(BranchTarget::Label) - .collect(); self.emit(&MInst::BrTable { index, tmp1, tmp2, - targets, + targets: targets.clone(), }); } From 86a6f5c59e16e7371ef6286f53adc5e85467fc59 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 11 Sep 2023 08:53:24 -0700 Subject: [PATCH 07/14] Remove recursion from AMode lowering rules (#6968) * x64: Remove recursion in `to_amode` helper This commit removes the recursion present in the x64 backend's `to_amode` and `to_amode_add` helpers. The recursion is currently unbounded and controlled by user input meaning it's not too hard to craft user input which triggers stack overflow in the host. By removing recursion there's no need to worry about this since the stack depth will never be hit. The main concern with removing recursion is that code quality may not be quite as good any more. The purpose of the recursion here is to "hunt for constants" and update the immediate `Offset32`, and now without recursion only at most one constant is found and folded instead of an arbitrary number of constants as before. This should continue to produce the same code as before so long as optimizations are enabled, but without optimizations this will produce worse code than before. Note with a hypothetical mid-end optimization that does this constant folding for us the rules here could be further simplified to purely consider the shape of the input `Value` to amode computation without considering constants at all. * aarch64: Remove recursion from amode lowering rules Same as the prior commit for x64, but for aarch64 instead. Unlikely to be reachable from wasm due to this only being a part of addressing modes which are more strictly controlled in wasm (e.g. 
wasm addresses are always a zero-extended value added to a base, so it's not possible to have a long chain of constants at the top level in clif) --- cranelift/codegen/src/isa/aarch64/inst.isle | 49 ++++---- cranelift/codegen/src/isa/x64/inst.isle | 121 +++++++++++++------- tests/all/module.rs | 23 ++++ 3 files changed, 129 insertions(+), 64 deletions(-) diff --git a/cranelift/codegen/src/isa/aarch64/inst.isle b/cranelift/codegen/src/isa/aarch64/inst.isle index edfa31432ecf..2f3b614d4414 100644 --- a/cranelift/codegen/src/isa/aarch64/inst.isle +++ b/cranelift/codegen/src/isa/aarch64/inst.isle @@ -3088,19 +3088,36 @@ ;; at runtime plus the immediate offset `i32` provided. The `Type` here is used ;; to represent the size of the value being loaded or stored for offset scaling ;; if necessary. +;; +;; Note that this is broken up into two phases. In the first phase this attempts +;; to find constants within the `val` provided and fold them in to the `offset` +;; provided. Afterwards though the `amode_no_more_iconst` helper is used at +;; which pointer constants are no longer pattern-matched and instead only +;; various modes are generated. This in theory would not be necessary with +;; mid-end optimizations that fold constants into load/store immediate offsets +;; instead, but for now each backend needs to do this. 
(decl amode (Type Value i32) AMode) +(rule 0 (amode ty val offset) + (amode_no_more_iconst ty val offset)) +(rule 1 (amode ty (iadd x (iconst (simm32 y))) offset) + (if-let new_offset (s32_add_fallible y offset)) + (amode_no_more_iconst ty x new_offset)) +(rule 2 (amode ty (iadd (iconst (simm32 x)) y) offset) + (if-let new_offset (s32_add_fallible x offset)) + (amode_no_more_iconst ty y new_offset)) +(decl amode_no_more_iconst (Type Value i32) AMode) ;; Base case: move the `offset` into a register and add it to `val` via the ;; amode -(rule 0 (amode ty val offset) +(rule 0 (amode_no_more_iconst ty val offset) (AMode.RegReg val (imm $I64 (ImmExtend.Zero) (i64_as_u64 offset)))) ;; Optimize cases where the `offset` provided fits into a immediates of ;; various kinds of addressing modes. -(rule 1 (amode ty val offset) +(rule 1 (amode_no_more_iconst ty val offset) (if-let simm9 (simm9_from_i64 offset)) (AMode.Unscaled val simm9)) -(rule 2 (amode ty val offset) +(rule 2 (amode_no_more_iconst ty val offset) (if-let uimm12 (uimm12_scaled_from_i64 offset ty)) (AMode.UnsignedOffset val uimm12)) @@ -3111,15 +3128,15 @@ ;; instructions. Constants on the other hand added to the amode represent only ;; a single instruction folded in, so fewer instructions should be generated ;; with these higher priority than the rules above. 
-(rule 3 (amode ty (iadd x y) offset) +(rule 3 (amode_no_more_iconst ty (iadd x y) offset) (AMode.RegReg (amode_add x offset) y)) -(rule 4 (amode ty (iadd x (uextend y @ (value_type $I32))) offset) +(rule 4 (amode_no_more_iconst ty (iadd x (uextend y @ (value_type $I32))) offset) (AMode.RegExtended (amode_add x offset) y (ExtendOp.UXTW))) -(rule 4 (amode ty (iadd x (sextend y @ (value_type $I32))) offset) +(rule 4 (amode_no_more_iconst ty (iadd x (sextend y @ (value_type $I32))) offset) (AMode.RegExtended (amode_add x offset) y (ExtendOp.SXTW))) -(rule 5 (amode ty (iadd (uextend x @ (value_type $I32)) y) offset) +(rule 5 (amode_no_more_iconst ty (iadd (uextend x @ (value_type $I32)) y) offset) (AMode.RegExtended (amode_add y offset) x (ExtendOp.UXTW))) -(rule 5 (amode ty (iadd (sextend x @ (value_type $I32)) y) offset) +(rule 5 (amode_no_more_iconst ty (iadd (sextend x @ (value_type $I32)) y) offset) (AMode.RegExtended (amode_add y offset) x (ExtendOp.SXTW))) ;; `RegScaled*` rules where this matches an addition of an "index register" to a @@ -3129,10 +3146,10 @@ ;; Note that this can additionally bundle an extending operation but the ;; extension must happen before the shift. This will pattern-match the shift ;; first and then if that succeeds afterwards try to find an extend. 
-(rule 6 (amode ty (iadd x (ishl y (iconst (u64_from_imm64 n)))) offset) +(rule 6 (amode_no_more_iconst ty (iadd x (ishl y (iconst (u64_from_imm64 n)))) offset) (if-let $true (u64_eq (ty_bytes ty) (u64_shl 1 n))) (amode_reg_scaled (amode_add x offset) y ty)) -(rule 7 (amode ty (iadd (ishl y (iconst (u64_from_imm64 n))) x) offset) +(rule 7 (amode_no_more_iconst ty (iadd (ishl y (iconst (u64_from_imm64 n))) x) offset) (if-let $true (u64_eq (ty_bytes ty) (u64_shl 1 n))) (amode_reg_scaled (amode_add x offset) y ty)) @@ -3144,18 +3161,6 @@ (rule 2 (amode_reg_scaled base (sextend index @ (value_type $I32)) ty) (AMode.RegScaledExtended base index ty (ExtendOp.SXTW))) -;; Small optimizations where constants found in `iadd` are folded into the -;; `offset` immediate. -;; -;; NB: this should probably be done by mid-end optimizations rather than here -;; in the backend, but currently Cranelift doesn't do that. -(rule 8 (amode ty (iadd x (iconst (simm32 y))) offset) - (if-let new_offset (s32_add_fallible y offset)) - (amode ty x new_offset)) -(rule 9 (amode ty (iadd (iconst (simm32 x)) y) offset) - (if-let new_offset (s32_add_fallible x offset)) - (amode ty y new_offset)) - ;; Helper to add a 32-bit signed immediate to the register provided. This will ;; select an appropriate `add` instruction to use. (decl amode_add (Reg i32) Reg) diff --git a/cranelift/codegen/src/isa/x64/inst.isle b/cranelift/codegen/src/isa/x64/inst.isle index b772a17d96bc..40f478073a12 100644 --- a/cranelift/codegen/src/isa/x64/inst.isle +++ b/cranelift/codegen/src/isa/x64/inst.isle @@ -1034,56 +1034,93 @@ ;; Converts a `Value` and a static offset into an `Amode` for x64, attempting ;; to be as fancy as possible with offsets/registers/shifts/etc to make maximal ;; use of the x64 addressing modes. +;; +;; This is a bit subtle unfortunately due to a few constraints. 
This function +;; was originally written recursively but that can lead to stack overflow +;; for certain inputs due to the recursion being defined by user-controlled +;; input. This means that nowadays this function is not recursive and has a +;; specific structure to handle that. +;; +;; Additionally currently in CLIF all loads/stores have an `Offset32` immediate +;; to go with them, but the wasm lowering to CLIF doesn't use this meaning that +;; it's frequently 0. Additionally mid-end optimizations do not fold `iconst` +;; values into this `Offset32`, meaning that it's left up to backends to hunt +;; for constants for good codegen. That means that one important aspect of this +;; function is that it searches for constants to fold into the `Offset32` to +;; avoid unnecessary instructions. +;; +;; Note, though, that the "optimal addressing modes" are only guaranteed to be +;; generated if egraph-based optimizations have run. For example this will only +;; attempt to find one constant as opposed to many, and that'll only happen +;; with constant folding from optimizations. +;; +;; Finally there's two primary entry points for this function. One is this +;; function here, `to_amode,` and another is `to_amode_add`. The latter is used +;; by the lowering of `iadd` in the x64 backend to use the `lea` instruction +;; where the input is two `Value` operands instead of just one. Most of the +;; logic here is then deferred through `to_amode_add`. +;; +;; In the future if mid-end optimizations fold constants into `Offset32` then +;; this in theory can "simply" delegate to the `amode_imm_reg` helper, and +;; below can delegate to `amode_imm_reg_reg_shift`, or something like that. (decl to_amode (MemFlags Value Offset32) Amode) - -;; Base case, "just put it in a register" -(rule (to_amode flags base offset) - (Amode.ImmReg offset base flags)) - -;; Slightly-more-fancy case, if the address is the addition of two things then -;; delegate to the `to_amode_add` helper. 
+(rule 0 (to_amode flags base offset) + (amode_imm_reg flags base offset)) (rule 1 (to_amode flags (iadd x y) offset) - (to_amode_add flags x y offset)) + (to_amode_add flags x y offset)) ;; Same as `to_amode`, except that the base address is computed via the addition ;; of the two `Value` arguments provided. -(decl to_amode_add (MemFlags Value Value Offset32) Amode) - -;; Base case, "just put things in registers". Note that the shift value of 0 -;; here means `x + (y << 0)` which is the same as `x + y`. -(rule (to_amode_add flags x y offset) - (Amode.ImmRegRegShift offset x y 0 flags)) - -;; If the one of the arguments being added is itself a constant shift then -;; that can be modeled directly so long as the shift is a modestly small amount. -(rule 1 (to_amode_add flags x (ishl y (iconst (uimm8 shift))) offset) - (if (u32_lteq (u8_as_u32 shift) 3)) - (Amode.ImmRegRegShift offset x y shift flags)) -(rule 2 (to_amode_add flags (ishl y (iconst (uimm8 shift))) x offset) - (if (u32_lteq (u8_as_u32 shift) 3)) - (Amode.ImmRegRegShift offset x y shift flags)) - -;; Constant extraction rules. ;; -;; These rules attempt to find a constant within one of `x` or `y`, or deeper -;; within them if they have their own adds. These only succeed if the constant -;; itself can be represented with 32-bits and can be infallibly added to the -;; offset that we already have. +;; The primary purpose of this is to hunt for constants within the two `Value` +;; operands provided. Failing that this will defer to `amode_imm_reg` or +;; `amode_imm_reg_reg_shift` which is the final step in amode lowering and +;; performs final pattern matches related to shifts to see if that can be +;; peeled out into the amode. ;; -;; Note the recursion here where this rule is defined in terms of itself to -;; "peel" layers of constants. +;; In other words this function's job is to find constants and then defer to +;; `amode_imm_reg*`. 
+(decl to_amode_add (MemFlags Value Value Offset32) Amode) + +(rule 0 (to_amode_add flags x y offset) + (amode_imm_reg_reg_shift flags x y offset)) +(rule 1 (to_amode_add flags x (iconst (simm32 c)) offset) + (if-let sum (s32_add_fallible offset c)) + (amode_imm_reg flags x sum)) +(rule 2 (to_amode_add flags (iconst (simm32 c)) x offset) + (if-let sum (s32_add_fallible offset c)) + (amode_imm_reg flags x sum)) (rule 3 (to_amode_add flags (iadd x (iconst (simm32 c))) y offset) - (if-let sum (s32_add_fallible offset c)) - (to_amode_add flags x y sum)) -(rule 4 (to_amode_add flags x (iadd y (iconst (simm32 c))) offset) - (if-let sum (s32_add_fallible offset c)) - (to_amode_add flags x y sum)) -(rule 5 (to_amode_add flags x (iconst (simm32 c)) offset) - (if-let sum (s32_add_fallible offset c)) - (to_amode flags x sum)) -(rule 6 (to_amode_add flags (iconst (simm32 c)) x offset) - (if-let sum (s32_add_fallible offset c)) - (to_amode flags x sum)) + (if-let sum (s32_add_fallible offset c)) + (amode_imm_reg_reg_shift flags x y sum)) +(rule 4 (to_amode_add flags (iadd (iconst (simm32 c)) x) y offset) + (if-let sum (s32_add_fallible offset c)) + (amode_imm_reg_reg_shift flags x y sum)) +(rule 5 (to_amode_add flags x (iadd y (iconst (simm32 c))) offset) + (if-let sum (s32_add_fallible offset c)) + (amode_imm_reg_reg_shift flags x y sum)) +(rule 6 (to_amode_add flags x (iadd (iconst (simm32 c)) y) offset) + (if-let sum (s32_add_fallible offset c)) + (amode_imm_reg_reg_shift flags x y sum)) + +;; Final cases of amode lowering. Does not hunt for constants and only attempts +;; to pattern match add-of-shifts to generate fancier `ImmRegRegShift` modes, +;; otherwise falls back on `ImmReg`. 
+(decl amode_imm_reg (MemFlags Value Offset32) Amode) +(rule 0 (amode_imm_reg flags base offset) + (Amode.ImmReg offset base flags)) +(rule 1 (amode_imm_reg flags (iadd x y) offset) + (amode_imm_reg_reg_shift flags x y offset)) + +(decl amode_imm_reg_reg_shift (MemFlags Value Value Offset32) Amode) +(rule 0 (amode_imm_reg_reg_shift flags x y offset) + (Amode.ImmRegRegShift offset x y 0 flags)) ;; 0 == y<<0 == "no shift" +(rule 1 (amode_imm_reg_reg_shift flags x (ishl y (iconst (uimm8 shift))) offset) + (if (u32_lteq (u8_as_u32 shift) 3)) + (Amode.ImmRegRegShift offset x y shift flags)) +(rule 2 (amode_imm_reg_reg_shift flags (ishl y (iconst (uimm8 shift))) x offset) + (if (u32_lteq (u8_as_u32 shift) 3)) + (Amode.ImmRegRegShift offset x y shift flags)) ;; Offsetting an Amode. Used when we need to do consecutive ;; loads/stores to adjacent addresses. diff --git a/tests/all/module.rs b/tests/all/module.rs index 985f58989b7c..9efedb94bde6 100644 --- a/tests/all/module.rs +++ b/tests/all/module.rs @@ -216,3 +216,26 @@ fn missing_sse_and_floats_still_works() -> Result<()> { Ok(()) } + +#[test] +#[cfg_attr(miri, ignore)] +fn large_add_chain_no_stack_overflow() -> Result<()> { + let mut config = Config::new(); + config.cranelift_opt_level(OptLevel::None); + let engine = Engine::new(&config)?; + let mut wat = String::from( + " + (module + (func (result i64) + (i64.const 1) + ", + ); + for _ in 0..20_000 { + wat.push_str("(i64.add (i64.const 1))\n"); + } + + wat.push_str(")\n)"); + Module::new(&engine, &wat)?; + + Ok(()) +} From 4d31324e9e7079dd9cb717aa790c26b6494a630d Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 11 Sep 2023 09:39:52 -0700 Subject: [PATCH 08/14] Finish release notes for 13.0.0 (#6969) * Finish release notes for 13.0.0 Filling out everything that happened since #6929 * Inline MSRV policy * More notes --- RELEASES.md | 37 ++++++++++++++++++++++++++++++++----- 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/RELEASES.md b/RELEASES.md 
index 8c2dbe14dddd..2ada87ca8ed0 100644 --- a/RELEASES.md +++ b/RELEASES.md @@ -20,9 +20,10 @@ Unreleased. instead of at compile time. [#6807](https://github.com/bytecodealliance/wasmtime/pull/6807) -* `Engine::detect_precompiled` can be used to to determine whether some bytes - look like a precompiled module or a component. +* `Engine::detect_precompiled{,_file}` can be used to to determine whether some + bytes or a file look like a precompiled module or a component. [#6832](https://github.com/bytecodealliance/wasmtime/pull/6832) + [#6937](https://github.com/bytecodealliance/wasmtime/pull/6937) * A new feature "wmemcheck" has been added to enable Valgrind-like detection of use-after-free within a WebAssembly guest module. @@ -38,9 +39,25 @@ Unreleased. * Wasmtime's implementation of the wasi-nn proposal now supports named models. [#6854](https://github.com/bytecodealliance/wasmtime/pull/6854) -* The C API now supports configuring `native_unwind_info` and - `dynamic_memory_reserved_for_growth`. +* The C API now supports configuring `native_unwind_info`, + `dynamic_memory_reserved_for_growth`, `target`, and Cranelift settings. [#6896](https://github.com/bytecodealliance/wasmtime/pull/6896) + [#6934](https://github.com/bytecodealliance/wasmtime/pull/6934) + +* The `wasmtime` crate now has initial support for component model bindings + generation for the WIT `resource` type. + [#6886](https://github.com/bytecodealliance/wasmtime/pull/6886) + +* Cranelift's RISC-V backend now has a complete implementation of the + WebAssembly SIMD proposal. Many thanks to Afonso Bordado for all their + contributions! + [#6920](https://github.com/bytecodealliance/wasmtime/pull/6920) + [#6924](https://github.com/bytecodealliance/wasmtime/pull/6924) + +* The `bindgen!` macro in the `wasmtime` crate now supports conditional + configuration for which imports should be `async` and which should be + synchronous. 
+ [#6942](https://github.com/bytecodealliance/wasmtime/pull/6942) ### Changed @@ -90,13 +107,23 @@ Unreleased. These methods do not affect the size of the pre-allocated pool. [#6835](https://github.com/bytecodealliance/wasmtime/pull/6835) -* Builder methods for WASI contexts onw use `&mut self` instead of `self`. +* Builder methods for WASI contexts now use `&mut self` instead of `self`. [#6770](https://github.com/bytecodealliance/wasmtime/pull/6770) * Native unwinding information is now properly disabled when it is configured to be turned off. [#6547](https://github.com/bytecodealliance/wasmtime/pull/6547) +* Wasmtime's minimum supported Rust version (MSRV) is now 1.70.0. Wasmtime's + MSRV policy of supporting the last three releases of Rust (N-2) is now + additionally documented. More discussion can additionally be found on the PR + itself. + [#6900](https://github.com/bytecodealliance/wasmtime/pull/6900) + +* Wasmtime's support for DWARF debugging information has seen some fixes for + previously reported crashes. + [#6931](https://github.com/bytecodealliance/wasmtime/pull/6931) + ### Removed * Wasmtime's experimental implementation of wasi-crypto has been removed. More From 0ee6641151e59e8303182dc704782441abad1105 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 11 Sep 2023 09:56:12 -0700 Subject: [PATCH 09/14] Bring some of Wasmtime's documentation more up-to-date (#6994) * Remove tutorial/wasm-writing documentation This commit removes the tutorial and "Writing WebAssembly" documentation sections from Wasmtime's documentation. These sections are quite dated at this point (they still recommend `cargo wasi`!) and haven't been updated much since their inception, especially in the arena of components. Today it seems best to leave this sort of documentation to [other resources] which are more tailored towards documentation of writing wasm. 
[other resources]: https://component-model.bytecodealliance.org/ * Remove markdown example from docs Like the previous commit this is quite dated and recommends effectively-deprecated tooling and doesn't take into account components. * Update our intro docs a bit * Link security blog post from docs * Fold docs of wasm proposals into tier docs This was already a bit duplicated so consolidate into one location. Additionally add some proposals that weren't previously documented and move some around based on their implementation status. * Document unsupported features In an effort to head off questions about platform support I figure it might be a good idea to start documenting what's not supported at this time. This is intended to mirror the current state, not future, of Wasmtime. In other words this should answer the question of "Does Wasmtime support X?" as opposed to "Does Wasmtime want to support X?" since we want to eventually support all of these features in the limit. * Fold WASI docs into tier docs Similar to the previous commit but for WASI proposals. 
--- docs/SUMMARY.md | 11 -- docs/examples-markdown.md | 65 --------- docs/introduction.md | 17 +-- docs/security.md | 4 +- docs/stability-tiers.md | 97 +++++++++++- docs/stability-wasi-proposals-support.md | 60 -------- docs/stability-wasm-proposals-support.md | 43 ------ docs/tutorial-create-hello-world.md | 63 -------- docs/tutorial-run-hello-world.md | 28 ---- docs/tutorial.md | 7 - docs/wasm-assemblyscript.md | 77 ---------- docs/wasm-c.md | 28 ---- docs/wasm-rust.md | 178 ----------------------- docs/wasm-wat.md | 56 ------- docs/wasm.md | 13 -- 15 files changed, 100 insertions(+), 647 deletions(-) delete mode 100644 docs/examples-markdown.md delete mode 100644 docs/stability-wasi-proposals-support.md delete mode 100644 docs/stability-wasm-proposals-support.md delete mode 100644 docs/tutorial-create-hello-world.md delete mode 100644 docs/tutorial-run-hello-world.md delete mode 100644 docs/tutorial.md delete mode 100644 docs/wasm-assemblyscript.md delete mode 100644 docs/wasm-c.md delete mode 100644 docs/wasm-rust.md delete mode 100644 docs/wasm-wat.md delete mode 100644 docs/wasm.md diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md index 1d247417773c..21c1061f2031 100644 --- a/docs/SUMMARY.md +++ b/docs/SUMMARY.md @@ -1,11 +1,7 @@ # Summary - [Introduction](./introduction.md) -- [Tutorial](./tutorial.md) - - [Creating `hello-world.wasm`](./tutorial-create-hello-world.md) - - [Running `hello-world.wasm`](./tutorial-run-hello-world.md) - [Examples](./examples.md) - - [Markdown Parser](./examples-markdown.md) - [Debugging WebAssembly](./examples-debugging.md) - [Profiling WebAssembly](./examples-profiling.md) - [Profiling with Perf](./examples-profiling-perf.md) @@ -42,17 +38,10 @@ - [CLI Options](./cli-options.md) - [CLI Logging](./cli-logging.md) - [Cache Configuration](./cli-cache.md) -- [Writing WebAssembly](./wasm.md) - - [Rust](./wasm-rust.md) - - [C/C++](./wasm-c.md) - - [AssemblyScript](./wasm-assemblyscript.md) - - [WebAssembly Text Format 
(`*.wat`)](./wasm-wat.md) - [Stability](stability.md) - [Release Process](./stability-release.md) - [Tiers of support](./stability-tiers.md) - [Platform Support](./stability-platform-support.md) - - [Wasm Proposals Support](./stability-wasm-proposals-support.md) - - [WASI Proposals Support](./stability-wasi-proposals-support.md) - [Security](security.md) - [Disclosure Policy](./security-disclosure.md) - [Contributing](contributing.md) diff --git a/docs/examples-markdown.md b/docs/examples-markdown.md deleted file mode 100644 index 36b0d63d546e..000000000000 --- a/docs/examples-markdown.md +++ /dev/null @@ -1,65 +0,0 @@ -# Markdown Parser - -The following steps describe an implementation of a WASI markdown parser, in Rust, using [pulldown-cmark](https://github.com/raphlinus/pulldown-cmark). - -First, we will generate a new executable with cargo: - -```bash -cargo new --bin rust_wasi_markdown_parser -cd rust_wasi_markdown_parser -``` - -Also, we need to add the `structopt` and `pulldown_cmark` crates to our project: - -```bash -cargo add structopt pulldown_cmark -``` - -Then, we will open the `src/main.rs` and enter the following contents. Please see the comments to understand what our program will be doing. - -## `src/main.rs` - -```rust,ignore -{{#include ./rust_wasi_markdown_parser/src/main.rs}} -``` - -Next, we will want to add WASI as a target that we can compile to. We will ask the rustup tool to install support for WASI. Then, we will compile our program to WASI. To do this we will run: - -```bash -rustup target add wasm32-wasi -cargo build --target wasm32-wasi -``` - -Our wasm file should be compiled to `target/wasm32-wasi/debug/rust_wasi_markdown_parser.wasm`. It is worth noting that even though the WASI APIs are not being used directly, when we compile our program to target WASI, the rust APIs and standard library will be using these WASI APIs under the hood for us! Now that we have our program compiled to target WASI, let's run our program! 
- -To do this, we can use the Wasmtime CLI. However, there is one thing to note about Wasmtime, WASI, and the capability based security model. We need to give our program explicit access to read files on our device. Wasm modules that implement WASI will not have this capability unless we give them the capability. - -To grant the capability to read in a directory using the Wasmtime CLI, we need to use the --dir flag. --dir will instruct wasmtime to make the passed directory available to access files from. (You can also `--mapdir GUEST_DIRECTORY::HOST_DIRECTORY` to make it available under a different path inside the content.) For example: - -```bash -wasmtime --dir . my-wasi-program.wasm -``` - -For this example, we will be passing a markdown file to our program called: `example_markdown.md`, that will exist in whatever our current directory (`./`) is. Our markdown file, `example_markdown.md`, will contain: - -```md -# Hello! - -I am example markdown for this demo! -``` - -So, **to run our compiled WASI program, we will run**: - -```bash -wasmtime --dir . target/wasm32-wasi/debug/rust_wasi_markdown_parser.wasm -- ./example_markdown.md -``` - -Which should look like the following: - -```html -

<h1>Hello!</h1>
-<p>I am example markdown for this demo!</p>
-``` - -Hooray! We were able to write a Wasm Module, that uses WASI to read a markdown file, parse the markdown, and write the output to stdout! Continue reading to see more examples of using Wasmtime to execute Wasm Modules, from the CLI or even embedded in your application! - diff --git a/docs/introduction.md b/docs/introduction.md index 904d8638c858..207285c71866 100644 --- a/docs/introduction.md +++ b/docs/introduction.md @@ -1,21 +1,19 @@ # Introduction [Wasmtime][github] is a [Bytecode Alliance][BA] project that is a standalone -wasm-only optimizing runtime for [WebAssembly] and [WASI]. It runs WebAssembly -code [outside of the Web], and can be used both as a command-line utility or as -a library embedded in a larger application. +optimizing runtime for [WebAssembly], [the Component Model], and [WASI]. It runs +WebAssembly code [outside of the Web], and can be used both as a command-line +utility or as a library embedded in a larger application. Wasmtime strives to be +a highly configurable and embeddable runtime to run on any scale of application. -Wasmtime strives to be a highly configurable and embeddable runtime to run on -any scale of application. Many features are still under development so if you -have a question don't hesitate to [file an issue][issue]. +This documentation is intended to serve a number of purposes and within you'll +find: -This guide is intended to serve a number of purposes and within you'll find: - -* [How to create simple wasm modules](tutorial-create-hello-world.md) * [How to use Wasmtime from a number of languages](lang.md) * [How to install and use the `wasmtime` CLI](cli.md) * Information about [stability](stability.md) and [security](security.md) in Wasmtime. +* Documentation about [contributing](contributing.md) to Wasmtime. ... and more! The source for this guide [lives on GitHub](https://github.com/bytecodealliance/wasmtime/tree/main/docs) and @@ -27,3 +25,4 @@ contributions are welcome! 
[WASI]: https://wasi.dev [outside of the Web]: https://webassembly.org/docs/non-web/ [issue]: https://github.com/bytecodealliance/wasmtime/issues/new +[the Component Model]: https://github.com/WebAssembly/component-model diff --git a/docs/security.md b/docs/security.md index b36284a4ad6b..8eeea3a9beb8 100644 --- a/docs/security.md +++ b/docs/security.md @@ -4,7 +4,9 @@ One of WebAssembly (and Wasmtime's) main goals is to execute untrusted code in a safe manner inside of a sandbox. WebAssembly is inherently sandboxed by design (must import all functionality, etc). This document is intended to cover the various sandboxing implementation strategies that Wasmtime has as they are -developed. +developed. This has also been documented in a [historical blog post] too. + +[historical blog post]: https://bytecodealliance.org/articles/security-and-correctness-in-wasmtime At this time Wasmtime implements what's necessary for the WebAssembly specification, for example memory isolation between instances. Additionally the diff --git a/docs/stability-tiers.md b/docs/stability-tiers.md index d67c946cd277..19c052919241 100644 --- a/docs/stability-tiers.md +++ b/docs/stability-tiers.md @@ -27,9 +27,21 @@ For explanations of what each tier means see below. 
| Target | `x86_64-unknown-linux-gnu` | | WASI Proposal | `wasi_snapshot_preview1` | | WASI Proposal | `wasi_unstable` | -| WebAssembly Proposal | `bulk-memory` | -| WebAssembly Proposal | `reference-types` | -| WebAssembly Proposal | `simd` | +| WebAssembly Proposal | [`mutable-globals`] | +| WebAssembly Proposal | [`sign-extension-ops`] | +| WebAssembly Proposal | [`nontrapping-float-to-int-conversion`] | +| WebAssembly Proposal | [`multi-value`] | +| WebAssembly Proposal | [`bulk-memory`] | +| WebAssembly Proposal | [`reference-types`] | +| WebAssembly Proposal | [`simd`] | + +[`mutable-globals`]: https://github.com/WebAssembly/mutable-global/blob/master/proposals/mutable-global/Overview.md +[`sign-extension-ops`]: https://github.com/WebAssembly/spec/blob/master/proposals/sign-extension-ops/Overview.md +[`nontrapping-float-to-int-conversion`]: https://github.com/WebAssembly/spec/blob/master/proposals/nontrapping-float-to-int-conversion/Overview.md +[`multi-value`]: https://github.com/WebAssembly/spec/blob/master/proposals/multi-value/Overview.md +[`bulk-memory`]: https://github.com/WebAssembly/bulk-memory-operations/blob/master/proposals/bulk-memory-operations/Overview.md +[`reference-types`]: https://github.com/WebAssembly/reference-types/blob/master/proposals/reference-types/Overview.md +[`simd`]: https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md #### Tier 2 @@ -38,8 +50,31 @@ For explanations of what each tier means see below. 
| Target | `aarch64-unknown-linux-gnu`| Continuous fuzzing | | Target | `s390x-unknown-linux-gnu` | Continuous fuzzing | | Target | `x86_64-pc-windows-gnu` | Clear owner of the target | -| WebAssembly Proposal | `memory64` | Unstable wasm proposal | -| WebAssembly Proposal | `multi-memory` | Unstable wasm proposal | +| WebAssembly Proposal | [`memory64`]] | Unstable wasm proposal | +| WebAssembly Proposal | [`multi-memory`] | Unstable wasm proposal | +| WebAssembly Proposal | [`threads`] | Unstable wasm proposal | +| WebAssembly Proposal | [`component-model`] | Unstable wasm proposal | +| WebAssembly Proposal | [`tail-call`] | Unstable wasm proposal, performance work | +| WebAssembly Proposal | [`relaxed-simd`] | Unstable wasm proposal | +| WebAssembly Proposal | [`function-references`] | Unstable wasm proposal | +| WASI Proposal | [`wasi-io`] | Unstable WASI proposal | +| WASI Proposal | [`wasi-clocks`] | Unstable WASI proposal | +| WASI Proposal | [`wasi-filesystem`] | Unstable WASI proposal | +| WASI Proposal | [`wasi-random`] | Unstable WASI proposal | +| WASI Proposal | [`wasi-poll`] | Unstable WASI proposal | + +[`memory64`]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md +[`multi-memory`]: https://github.com/WebAssembly/multi-memory/blob/master/proposals/multi-memory/Overview.md +[`threads`]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md +[`component-model`]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Explainer.md +[`tail-call`]: https://github.com/WebAssembly/tail-call/blob/main/proposals/tail-call/Overview.md +[`relaxed-simd`]: https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md +[`function-references`]: https://github.com/WebAssembly/function-references/blob/main/proposals/function-references/Overview.md +[`wasi-clocks`]: https://github.com/WebAssembly/wasi-clocks +[`wasi-filesystem`]: 
https://github.com/WebAssembly/wasi-filesystem +[`wasi-io`]: https://github.com/WebAssembly/wasi-io +[`wasi-random`]: https://github.com/WebAssembly/wasi-random +[`wasi-poll`]: https://github.com/WebAssembly/wasi-poll #### Tier 3 @@ -48,12 +83,18 @@ For explanations of what each tier means see below. | Target | `aarch64-apple-darwin` | CI testing | | Target | `aarch64-pc-windows-msvc` | CI testing, unwinding, full-time maintainer | | Target | `riscv64gc-unknown-linux-gnu` | full-time maintainer | -| WASI Proposal | `wasi-nn` | More expansive CI testing | -| WebAssembly Proposal | `threads` | Complete implementation | -| WebAssembly Proposal | `component-model` | Complete implementation | +| WASI Proposal | [`wasi-nn`] | More expansive CI testing | +| WASI Proposal | [`wasi-threads`] | More CI, unstable proposal | +| WASI Proposal | [`wasi-sockets`] | Complete implementation | +| WASI Proposal | [`wasi-http`] | Complete implementation | | *misc* | Non-Wasmtime Cranelift usage [^1] | CI testing, full-time maintainer | | *misc* | DWARF debugging [^2] | CI testing, full-time maintainer, improved quality | +[`wasi-sockets`]: https://github.com/WebAssembly/wasi-sockets +[`wasi-nn`]: https://github.com/WebAssembly/wasi-nn +[`wasi-threads`]: https://github.com/WebAssembly/wasi-threads +[`wasi-http`]: https://github.com/WebAssembly/wasi-http + [^1]: This is intended to encompass features that Cranelift supports as a general-purpose code generator such as integer value types other than `i32` and `i64`, non-Wasmtime calling conventions, code model settings, relocation @@ -67,6 +108,46 @@ support is currently best-effort. Additionally there are known shortcomings and bugs. At this time there's no developer time to improve the situation here as well. +#### Unsupported features and platforms + +While this is not an exhaustive list, Wasmtime does not currently have support +for the following features. 
Note that this is intended to document Wasmtime's +current state and does not mean Wasmtime does not want to ever support these +features; rather design discussion and PRs are welcome for many of the below +features to figure out how best to implement them and at least move them to Tier +3 above. + +* Target: ARM 32-bit +* Target: WebAssembly (compiling Wasmtime to WebAssembly itself) +* Target: [FreeBSD](https://github.com/bytecodealliance/wasmtime/issues/5499) +* Target: [NetBSD/OpenBSD](https://github.com/bytecodealliance/wasmtime/issues/6962) +* Target: [i686 (32-bit Intel targets)](https://github.com/bytecodealliance/wasmtime/issues/1980) +* Target: Android +* Target: MIPS +* Target: SPARC +* Target: PowerPC +* Target: RISC-V 32-bit +* [WebAssembly proposal: `branch-hinting`](https://github.com/WebAssembly/branch-hinting) +* [WebAssembly proposal: `exception-handling`](https://github.com/WebAssembly/exception-handling) +* [WebAssembly proposal: `extended-const`](https://github.com/WebAssembly/extended-const) +* [WebAssembly proposal: `flexible-vectors`](https://github.com/WebAssembly/flexible-vectors) +* [WebAssembly proposal: `gc`](https://github.com/WebAssembly/gc) +* [WebAssembly proposal: `memory-control`](https://github.com/WebAssembly/memory-control) +* [WebAssembly proposal: `stack-switching`](https://github.com/WebAssembly/stack-switching) +* [WASI proposal: `proxy-wasm`](https://github.com/proxy-wasm/spec) +* [WASI proposal: `wasi-blob-store`](https://github.com/WebAssembly/wasi-blob-store) +* [WASI proposal: `wasi-crypto`](https://github.com/WebAssembly/wasi-crypto) +* [WASI proposal: `wasi-data`](https://github.com/WebAssembly/wasi-data) +* [WASI proposal: `wasi-distributed-lock-service`](https://github.com/WebAssembly/wasi-distributed-lock-service) +* [WASI proposal: `wasi-grpc`](https://github.com/WebAssembly/wasi-grpc) +* [WASI proposal: `wasi-kv-store`](https://github.com/WebAssembly/wasi-kv-store) +* [WASI proposal: 
`wasi-message-queue`](https://github.com/WebAssembly/wasi-message-queue) +* [WASI proposal: `wasi-parallel`](https://github.com/WebAssembly/wasi-parallel) +* [WASI proposal: `wasi-pubsub`](https://github.com/WebAssembly/wasi-pubsub) +* [WASI proposal: `wasi-runtime-config`](https://github.com/WebAssembly/wasi-runtime-config) +* [WASI proposal: `wasi-sql`](https://github.com/WebAssembly/wasi-sql) +* [WASI proposal: `wasi-url`](https://github.com/WebAssembly/wasi-url) + ## Tier Details Wasmtime's precise definitions of tiers are not guaranteed to be constant over diff --git a/docs/stability-wasi-proposals-support.md b/docs/stability-wasi-proposals-support.md deleted file mode 100644 index a3964bf5ebc5..000000000000 --- a/docs/stability-wasi-proposals-support.md +++ /dev/null @@ -1,60 +0,0 @@ -# WASI Proposals Support - -The following table summarizes Wasmtime's support for WASI [proposals]. If a -proposal is not listed, then it is not supported by Wasmtime. - -[proposals]: https://github.com/WebAssembly/WASI/blob/main/Proposals.md - -| WASI Proposal | Supported in Wasmtime? | Enabled by default? 
| CLI Flag Name [^cli] | -|----------------------------------------|-------------------------|----------------------|-----------------------------| -| [I/O][wasi-io] | **Yes** | **Yes** | `wasi-common` | -| [Filesystem][wasi-filesystem] | **Yes** | **Yes** | `wasi-common` | -| [Clocks][wasi-clocks] | **Yes** | **Yes** | `wasi-common` | -| [Random][wasi-random] | **Yes** | **Yes** | `wasi-common` | -| [Poll][wasi-poll] | **Yes** | **Yes** | `wasi-common` | -| [Machine Learning (wasi-nn)][wasi-nn] | **Yes** | No | `experimental-wasi-nn` | -| [Blob Store][wasi-blob-store] | No | No | N/A | -| [Crypto][wasi-crypto] | No | No | N/A | -| [Distributed Lock Service][wasi-distributed-lock-service] | No | No | N/A | -| [gRPC][wasi-grpc] | No | No | N/A | -| [HTTP][wasi-http] | No | No | N/A | -| [Key-value Store][wasi-kv-store] | No | No | N/A | -| [Message Queue][wasi-message-queue] | No | No | N/A | -| [Parallel][wasi-parallel] | No (see [#4949]) | No | N/A | -| [Pub/sub][wasi-pubsub] | No | No | N/A | -| [Runtime Config][wasi-runtime-config] | No | No | N/A | -| [Sockets][wasi-sockets] | No | No | N/A | -| [SQL][wasi-sql] | No | No | N/A | -| [Threads][wasi-threads] | **Yes** | No | `experimental-wasi-threads` | - -[^cli]: The CLI flag name refers to to the `--wasi-modules` argument of the - `wasmtime` executable; e.g., `--wasi-modules=wasi-crypto`. See `wasmtime run - --help` for more information on the flag's default value and configuration. -[^crypto]: Build Wasmtime with `--features=wasi-crypto` to enable this. 
- -[#4949]: https://github.com/bytecodealliance/wasmtime/pull/4949 -[wasi-blob-store]: https://github.com/WebAssembly/wasi-blob-store -[wasi-clocks]: https://github.com/WebAssembly/wasi-clocks -[wasi-classic-command]: https://github.com/WebAssembly/wasi-classic-command -[wasi-crypto]: https://github.com/WebAssembly/wasi-crypto -[wasi-data]: https://github.com/singlestore-labs/wasi-data -[wasi-distributed-lock-service]: https://github.com/WebAssembly/wasi-distributed-lock-service -[wasi-filesystem]: https://github.com/WebAssembly/wasi-filesystem -[wasi-grpc]: https://github.com/WebAssembly/wasi-grpc -[wasi-handle-index]: https://github.com/WebAssembly/wasi-handle-index -[wasi-http]: https://github.com/WebAssembly/wasi-http -[wasi-io]: https://github.com/WebAssembly/wasi-io -[wasi-kv-store]: https://github.com/WebAssembly/wasi-kv-store -[wasi-message-queue]: https://github.com/WebAssembly/wasi-message-queue -[wasi-misc]: https://github.com/WebAssembly/wasi-misc -[wasi-threads]: https://github.com/WebAssembly/wasi-native-threads -[wasi-nn]: https://github.com/WebAssembly/wasi-nn -[wasi-random]: https://github.com/WebAssembly/wasi-random -[wasi-parallel]: https://github.com/WebAssembly/wasi-parallel -[wasi-poll]: https://github.com/WebAssembly/wasi-poll -[wasi-proxy-wasm]: https://github.com/proxy-wasm/spec -[wasi-pubsub]: https://github.com/WebAssembly/wasi-pubsub -[wasi-runtime-config]: https://github.com/WebAssembly/wasi-runtime-config -[wasi-sockets]: https://github.com/WebAssembly/wasi-sockets -[wasi-sql]: https://github.com/WebAssembly/wasi-sql -[wasi-url]: https://github.com/WebAssembly/wasi-url diff --git a/docs/stability-wasm-proposals-support.md b/docs/stability-wasm-proposals-support.md deleted file mode 100644 index 9d56a72fa32b..000000000000 --- a/docs/stability-wasm-proposals-support.md +++ /dev/null @@ -1,43 +0,0 @@ -# WebAssembly Proposals Support - -The following table summarizes Wasmtime's support for WebAssembly proposals as -well as the command line 
flag and [`wasmtime::Config`][config] method you can -use to enable or disable support for a proposal. - -If a proposal is not listed, then it is not supported by Wasmtime. - -Wasmtime will never enable a proposal by default unless it has reached phase 4 -of [the WebAssembly standardizations process][phases] and its implementation in -Wasmtime has been [thoroughly -vetted](./contributing-implementing-wasm-proposals.html). - -| WebAssembly Proposal | Supported in Wasmtime? | Command Line Name | [`Config`][config] Method | -|---------------------------------------------|----------------------------------|--------------------|---------------------------| -| **[Import and Export Mutable Globals]** | **Yes.**<br/>Always enabled. | (none) | (none) | -| **[Sign-Extension Operations]** | **Yes.**<br/>Always enabled. | (none) | (none) | -| **[Non-Trapping Float-to-Int Conversions]** | **Yes.**<br/>Always enabled. | (none) | (none) | -| **[Multi-Value]** | **Yes.**<br/>Enabled by default. | `multi-value` | [`wasm_multi_value`](https://docs.rs/wasmtime/*/wasmtime/struct.Config.html#method.wasm_multi_value) | -| **[Bulk Memory Operations]** | **Yes.**<br/>Enabled by default. | `bulk-memory` | [`wasm_bulk_memory`](https://docs.rs/wasmtime/*/wasmtime/struct.Config.html#method.wasm_bulk_memory) | -| **[Reference Types]** | **Yes.**<br/>Enabled by default. | `reference-types` | [`wasm_reference_types`](https://docs.rs/wasmtime/*/wasmtime/struct.Config.html#method.wasm_reference_types) | -| **[Fixed-Width SIMD]** | **Yes.**<br/>Enabled by default. | `simd` | [`wasm_simd`](https://docs.rs/wasmtime/*/wasmtime/struct.Config.html#method.wasm_simd) | -| **[Threads and Atomics]** | **Yes.** | `threads` | [`wasm_threads`](https://docs.rs/wasmtime/*/wasmtime/struct.Config.html#method.wasm_threads) | -| **[Multi-Memory]** | **Yes.** | `multi-memory` | [`wasm_multi_memory`](https://docs.rs/wasmtime/*/wasmtime/struct.Config.html#method.wasm_multi_memory) | -| **[Component Model]** | **In progress.** | `component-model` | [`wasm_component_model`](https://docs.rs/wasmtime/*/wasmtime/struct.Config.html#method.wasm_component_model) | -| **[Memory64]** | **Yes.** | `memory64` | [`wasm_memory64`](https://docs.rs/wasmtime/*/wasmtime/struct.Config.html#method.wasm_memory64) | - -The "Command Line Name" refers to the `--wasm-features` CLI argument of the -`wasmtime` executable and the name which must be passed to enable it. - -[config]: https://docs.rs/wasmtime/*/wasmtime/struct.Config.html -[Multi-Value]: https://github.com/WebAssembly/spec/blob/master/proposals/multi-value/Overview.md -[Bulk Memory Operations]: https://github.com/WebAssembly/bulk-memory-operations/blob/master/proposals/bulk-memory-operations/Overview.md -[Import and Export Mutable Globals]: https://github.com/WebAssembly/mutable-global/blob/master/proposals/mutable-global/Overview.md -[Reference Types]: https://github.com/WebAssembly/reference-types/blob/master/proposals/reference-types/Overview.md -[Non-Trapping Float-to-Int Conversions]: https://github.com/WebAssembly/spec/blob/master/proposals/nontrapping-float-to-int-conversion/Overview.md -[Sign-Extension Operations]: https://github.com/WebAssembly/spec/blob/master/proposals/sign-extension-ops/Overview.md -[Fixed-Width SIMD]: https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md -[phases]: https://github.com/WebAssembly/meetings/blob/master/process/phases.md -[Threads and Atomics]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md
-[Multi-Memory]: https://github.com/WebAssembly/multi-memory/blob/master/proposals/multi-memory/Overview.md -[Component Model]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Explainer.md -[Memory64]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md diff --git a/docs/tutorial-create-hello-world.md b/docs/tutorial-create-hello-world.md deleted file mode 100644 index 6f2d9cb6da66..000000000000 --- a/docs/tutorial-create-hello-world.md +++ /dev/null @@ -1,63 +0,0 @@ -# Creating `hello-world.wasm` - -There are a number of ways to create `.wasm` files but for the purposes of this -tutorial, we'll be using the Rust toolchain. You can find more information on -creating `.wasm` files from other languages in the -[Writing WebAssembly section](./wasm.md). - -To build WebAssembly binaries with Rust, you'll need the standard Rust toolchain. - -[Follow these instructions to install `rustc`, `rustup` and `cargo`](https://www.rust-lang.org/tools/install) - -Next, you should add WebAssembly as a build target for cargo like so: - -```sh -$ rustup target add wasm32-wasi -``` - -Finally, create a new Rust project called 'hello-world'. You can do this by running: - -```sh -$ cargo new hello-world -``` - -After that, the hello-world folder should look like this. - -```text -hello-world/ -├── Cargo.lock -├── Cargo.toml -└── src - └── main.rs -``` - -And the `main.rs` file inside the `src` folder should contain the following rust code. - -```rust -fn main() { - println!("Hello, world!"); -} - -``` - -Now, we can tell `cargo` to build a WebAssembly file: - -```sh -$ cargo build --target wasm32-wasi -``` - -Now, in the `target` folder, there's a `hello-world.wasm` file. You can find it here: - -```text -hello-world/ -├── Cargo.lock -├── Cargo.toml -├── src -└── target - └── ... - └── wasm32-wasi - └── debug - └── ... 
- └── hello-world.wasm - -``` diff --git a/docs/tutorial-run-hello-world.md b/docs/tutorial-run-hello-world.md deleted file mode 100644 index 770f14fce7e6..000000000000 --- a/docs/tutorial-run-hello-world.md +++ /dev/null @@ -1,28 +0,0 @@ -# Running `hello-world.wasm` with Wasmtime - -## Installing Wasmtime - -The Wasmtime CLI can be installed on Linux and macOS with a small install -script: - -```sh -$ curl https://wasmtime.dev/install.sh -sSf | bash -``` - -You can find more information about installing the Wasmtime CLI in the -[CLI Installation section](./cli-install.md) - -## Running `hello-world.wasm` - -There are a number of ways to run a `.wasm` file with Wasmtime. In this -tutorial, we'll be using the CLI, Wasmtime can also be embedded in your -applications. More information on this can be found in the -[Embedding Wasmtime section](./lang.md). - -If you've built the `hello-world.wasm` file (the instructions for doing so are in the -[previous section](./tutorial-create-hello-world.md)), -you can run it with Wasmtime from the command line like so: - -```sh -$ wasmtime target/wasm32-wasi/debug/hello-world.wasm -``` diff --git a/docs/tutorial.md b/docs/tutorial.md deleted file mode 100644 index dc19ed194691..000000000000 --- a/docs/tutorial.md +++ /dev/null @@ -1,7 +0,0 @@ -# Tutorial - -This tutorial walks through creating a simple Hello World WebAssembly program -and then running it. - -* [Creating `hello-world.wasm`](tutorial-create-hello-world.md) -* [Running `hello-world.wasm`](tutorial-run-hello-world.md) diff --git a/docs/wasm-assemblyscript.md b/docs/wasm-assemblyscript.md deleted file mode 100644 index 527b7828f238..000000000000 --- a/docs/wasm-assemblyscript.md +++ /dev/null @@ -1,77 +0,0 @@ -# AssemblyScript - -[AssemblyScript] has included support for targeting WASI since version 0.10.0. 
If you're not familiar with AssemblyScript, check out the [docs](https://www.assemblyscript.org/introduction.html) or the [discord server](https://discord.gg/assemblyscript). -To setup this demo, you need a valid installation of [NodeJS](https://nodejs.org/), [Deno](https://deno.com/runtime), or [Bun](https://bun.sh/) along with a installation of [wasmtime](https://github.com/bytecodealliance/wasmtime). - -For the rest of this documentation, we'll default to NPM as our package manager. Feel free to use the manager of your choice. - -Let's walk through a simple hello world example using the latest AssemblyScript runtime (at the time of this writing, it is AssemblyScript runtime included in version 0.27.x): - -## Hello World! - -Enabling WASI support in AssemblyScript requires some configuration and dependencies in order to compile with WASI support. - -First, we'll install [assemblyscript](https://github.com/AssemblyScript/AssemblyScript) along with [wasi-shim](https://github.com/AssemblyScript/wasi-shim) which is a plugin that adds support for WASI. - -```sh -$ npm install --save-dev assemblyscript @assemblyscript/wasi-shim -``` - -Next, we'll use the built in `asinit` command to create our project files. When prompted, type `y` and return. - -```sh -$ npx asinit . -``` - -Next, we need to configure our project to use WASI as a build target. Navigate to `asconfig.json` and add the following line. - -`asconfig.json` -```json -{ - // "targets": { ... }, - "extends": "./node_modules/@assemblyscript/wasi-shim/asconfig.json" -} -``` - -With AssemblyScript now configured to use WASI, we can enter `./assembly/index.ts` and change it to the following. This will tell WASI to print the string to the terminal. - -`assembly/index.ts` -```js -console.log("Hello World!"); -``` - -Now, compile our WASI module using the `asc` command and run it using `wasmtime`. 
- -```sh -$ npx asc assembly/index.ts --target release -$ wasmtime ./build/release.wasm -``` - -Now that we know how to use WASI, we'll test the capabilities of WASI using a demo. - -## WASI Demo - -First, clone the [wasmtime](https://github.com/bytecodealliance/wasmtime) repository and navigate to the `docs/assemblyscript_demo` directory. - -```sh -$ git clone https://github.com/bytecodealliance/wasmtime -$ cd wasmtime/docs/assemblyscript_demo -``` - -Install our dependencies with NPM or your preferred package manager. - -```sh -$ npm install -``` - -Take a look at the code in `docs/assemblyscript_demo/wasi-demo.ts` and then build the WASM/WASI binary by running - -```sh -$ npx asc wasi-demo.ts --target wasi-demo -``` - -Lastly, run the demo using - -```sh -$ wasmtime ./build/wasi-demo.wasm -``` diff --git a/docs/wasm-c.md b/docs/wasm-c.md deleted file mode 100644 index d4c05d6a94b9..000000000000 --- a/docs/wasm-c.md +++ /dev/null @@ -1,28 +0,0 @@ -# C/C++ - -All the parts needed to support wasm are included in upstream clang, lld, and -compiler-rt, as of the LLVM 8.0 release. However, to use it, you'll need -to build WebAssembly-targeted versions of the library parts, and it can -be tricky to get all the CMake invocations lined up properly. - -To make things easier, we provide -[prebuilt packages](https://github.com/WebAssembly/wasi-sdk/releases) -that provide builds of Clang and sysroot libraries. - -WASI doesn't yet support `setjmp`/`longjmp` or C++ exceptions, as it is -waiting for [unwinding support in WebAssembly]. - -By default, the C/C++ toolchain orders linear memory to put the globals first, -the stack second, and start the heap after that. This reduces code size, -because references to globals can use small offsets. However, it also means -that stack overflow will often lead to corrupted globals. 
The -`-Wl,--stack-first` flag to clang instructs it to put the stack first, followed -by the globals and the heap, which may produce slightly larger code, but will -more reliably trap on stack overflow. - -See the [wasm-ld documentation] for more information and additional flags. Note -flags related to dynamic linking, such `-shared` and `--export-dynamic` are -not yet stable and are expected to change behavior in the future. - -[unwinding support in WebAssembly]: https://github.com/WebAssembly/exception-handling/ -[wasm-ld documentation]: https://lld.llvm.org/WebAssembly.html diff --git a/docs/wasm-rust.md b/docs/wasm-rust.md deleted file mode 100644 index 1e8fddd86ced..000000000000 --- a/docs/wasm-rust.md +++ /dev/null @@ -1,178 +0,0 @@ -# Rust - -The [Rust Programming Language](https://www.rust-lang.org) supports WebAssembly -as a compilation target. If you're not familiar with Rust it's recommended to -start [with its introductory documentation](https://www.rust-lang.org/learn). -Compiling to WebAssembly will involve specifying the desired target via the -`--target` flag, and to do this there are a number of "target triples" for -WebAssembly compilation in Rust: - -* `wasm32-wasi` - when using `wasmtime` this is likely what you'll be using. The - WASI target is integrated into the standard library and is intended on - producing standalone binaries. -* `wasm32-unknown-unknown` - this target, like the WASI one, is focused on - producing single `*.wasm` binaries. The standard library, however, is largely - stubbed out since the "unknown" part of the target means libstd can't assume - anything. This means that while binaries will likely work in `wasmtime`, - common conveniences like `println!` or `panic!` won't work. -* `wasm32-unknown-emscripten` - this target is intended to work in a web browser - and produces a `*.wasm` file coupled with a `*.js` file, and it is not - compatible with `wasmtime`. 
- -For the rest of this documentation we'll assume that you're using the -`wasm32-wasi` target for compiling Rust code and executing inside of `wasmtime`. - -## Hello, World! - -Cross-compiling to WebAssembly involves a number of knobs that need -configuration, but you can often gloss over these internal details by using -build tooling intended for the WASI target. For example we can start out writing -a WebAssembly binary with [`cargo -wasi`](https://github.com/bytecodealliance/cargo-wasi). - -First up we'll [install `cargo -wasi`](https://bytecodealliance.github.io/cargo-wasi/install.html): - -```sh -$ cargo install cargo-wasi -``` - -Next we'll make a new Cargo project: - -```sh -$ cargo new hello-world -$ cd hello-world -``` - -Inside of `src/main.rs` you'll see the canonical Rust "Hello, World!" using -`println!`. We'll be executing this for the `wasm32-wasi` target, so you'll want -to make sure you're previously [built `wasmtime` and inserted it into -`PATH`](./cli-install.md); - -```sh -$ cargo wasi run -info: downloading component 'rust-std' for 'wasm32-wasi' -info: installing component 'rust-std' for 'wasm32-wasi' - Compiling hello-world v0.1.0 (/hello-world) - Finished dev [unoptimized + debuginfo] target(s) in 0.16s - Running `/.cargo/bin/cargo-wasi target/wasm32-wasi/debug/hello-world.wasm` - Running `target/wasm32-wasi/debug/hello-world.wasm` -Hello, world! -``` - -And we're already running our first WebAssembly code inside of `wasmtime`! - -While it's automatically happening for you as part of `cargo wasi`, you can also -run `wasmtime` yourself: - -```sh -$ wasmtime target/wasm32-wasi/debug/hello-world.wasm -Hello, world! -``` - -You can check out the [introductory documentation of -`cargo-wasi`](https://bytecodealliance.github.io/cargo-wasi/hello-world.html) as -well for some more information. - -## Writing Libraries - -Previously for "Hello, World!" we created a *binary* project which used -`src/main.rs`. 
Not all `*.wasm` binaries are intended to be executed like -commands, though. Some are intended to be loaded into applications and called -through various APIs, acting more like libraries. For this use case you'll want -to add this to `Cargo.toml`: - -```toml -# in Cargo.toml ... - -[lib] -crate-type = ['cdylib'] -``` - -and afterwards you'll want to write your code in `src/lib.rs` like so: - -```rust -#[no_mangle] -pub extern "C" fn print_hello() { - println!("Hello, world!"); -} -``` - -When you execute `cargo wasi build` that'll generate a `*.wasm` file which has -one exported function, `print_hello`. We can then run it via the CLI like so: - -```sh -$ cargo wasi build - Compiling hello-world v0.1.0 (/home/alex/code/hello-world) - Finished dev [unoptimized + debuginfo] target(s) in 0.08s -$ wasmtime --invoke print_hello target/wasm32-wasi/debug/hello_world.wasm -Hello, world! -``` - -As a library crate one of your primary consumers may be other languages as well. -You'll want to consult the [section of this book for using `wasmtime` from -Python](./lang-python.md) and after running through the basics there you can -execute our file in Python: - -```sh -$ cp target/wasm32-wasi/debug/hello_world.wasm . -$ python3 ->>> import wasmtime ->>> import hello_world ->>> hello_world.print_hello() -Hello, world! -() ->>> -``` - -Note that this form of using `#[no_mangle]` Rust functions is pretty primitive. -You're only able to work with primitive datatypes like integers and floats. -While this works for some applications if you need to work with richer types -like strings or structs, then you'll want to use the support in `wasmtime` for -interface types. - -## Exporting Rust functionality - -Currently only Rust functions can be exported from a wasm module. Rust functions -must be `#[no_mangle]` to show up in the final binary. - -Memory is by default exported from Rust modules under the name `memory`. 
This -can be tweaked with the `-Clink-arg` flag to rustc to pass flags to LLD, the -WebAssembly code linker. - -Tables cannot be imported at this time. When using `rustc` directly there is no -support for `anyref` and only one function table is supported. When using -`wasm-bindgen` it may inject an `anyref` table if necessary, but this table is -an internal detail and is not exported. The function table can be exported by -passing the `--export-table` argument to LLD (via `-C link-arg`) or can be -imported with the `--import-table`. - -Rust currently does not have support for exporting or importing custom `global` -values. - -## Importing host functionality - -Only functions can be imported in Rust at this time, and they can be imported -via raw interfaces like: - -```rust -# struct MyStruct; -#[link(wasm_import_module = "the-wasm-import-module")] -extern "C" { - // imports the name `foo` from `the-wasm-import-module` - fn foo(); - - // functions can have integer/float arguments/return values - fn translate(a: i32) -> f32; - - // Note that the ABI of Rust and wasm is somewhat in flux, so while this - // works, it's recommended to rely on raw integer/float values where - // possible. - fn translate_fancy(my_struct: MyStruct) -> u32; - - // you can also explicitly specify the name to import, this imports `bar` - // instead of `baz` from `the-wasm-import-module`. - #[link_name = "bar"] - fn baz(); -} -``` diff --git a/docs/wasm-wat.md b/docs/wasm-wat.md deleted file mode 100644 index 634d3573090b..000000000000 --- a/docs/wasm-wat.md +++ /dev/null @@ -1,56 +0,0 @@ -# WebAssembly Text Format (`*.wat`) - -While not necessarily a full-blown language you might be curious how Wasmtime -interacts with [the `*.wat` text format][spec]! The `wasmtime` CLI and Rust -embedding API both support the `*.wat` text format by default. - -"Hello, World!" is pretty nontrivial in the `*.wat` format since it's -assembly-like and not really intended to be a primary programming language. 
That -being said we can create a simple add function to call it! - -For example if you have a file `add.wat` like so: - -```wat -(module - (func (export "add") (param i32 i32) (result i32) - local.get 0 - local.get 1 - i32.add)) -``` - -Then you can execute this on the CLI with: - -```sh -$ wasmtime add.wat --invoke add 1 2 -warning: ... -warning: ... -3 -``` - -And we can see that we're already adding numbers! - -You can also see how this works in the Rust API like so: - -```rust -# extern crate wasmtime; -# extern crate anyhow; -use wasmtime::*; - -# fn main() -> anyhow::Result<()> { -let mut store = Store::<()>::default(); -let wat = r#" - (module - (func (export "add") (param i32 i32) (result i32) - local.get 0 - local.get 1 - i32.add)) -"#; -let module = Module::new(store.engine(), wat)?; -let instance = Instance::new(&mut store, &module, &[])?; -let add = instance.get_typed_func::<(i32, i32), i32>(&mut store, "add")?; -println!("1 + 2 = {}", add.call(&mut store, (1, 2))?); -# Ok(()) -# } -``` - -[spec]: https://webassembly.github.io/spec/core/text/index.html diff --git a/docs/wasm.md b/docs/wasm.md deleted file mode 100644 index fb442f6006e0..000000000000 --- a/docs/wasm.md +++ /dev/null @@ -1,13 +0,0 @@ -# Writing WebAssembly - -Wasmtime is a runtime for *executing* WebAssembly but you also at some point -need to actually produce the WebAssembly module to feed into Wasmtime! This -section of the guide is intended to provide some introductory documentation for -compiling source code to WebAssembly to later run in Wasmtime. There's plenty of -other documentation on the web for doing this, so you'll want to be sure to -check out your language's documentation for WebAssembly as well. 
- -* [Rust](wasm-rust.md) -* [C/C++](wasm-c.md) -* [AssemblyScript](wasm-assemblyscript.md) -* [WebAssembly Text Format (`*.wat`)](wasm-wat.md) From 6084b73630ae5857ebb1b25ed425e1d1fafc79ed Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 11 Sep 2023 10:10:31 -0700 Subject: [PATCH 10/14] Fix build of `wasmtime-wasi-bench` (#6999) Looks like this wasn't built on CI since it wasn't tested. Flag it as testable which should build it on CI which should catch future errors like this. --- crates/bench-api/Cargo.toml | 1 - crates/bench-api/src/lib.rs | 17 ++++++++++------- 2 files changed, 10 insertions(+), 8 deletions(-) diff --git a/crates/bench-api/Cargo.toml b/crates/bench-api/Cargo.toml index 138a9d27a48f..83cce9065cae 100644 --- a/crates/bench-api/Cargo.toml +++ b/crates/bench-api/Cargo.toml @@ -12,7 +12,6 @@ publish = false [lib] name = "wasmtime_bench_api" crate-type = ["cdylib"] -test = false doctest = false [dependencies] diff --git a/crates/bench-api/src/lib.rs b/crates/bench-api/src/lib.rs index d293d81664b1..3a29eb94dc23 100644 --- a/crates/bench-api/src/lib.rs +++ b/crates/bench-api/src/lib.rs @@ -305,30 +305,30 @@ pub extern "C" fn wasm_bench_create( .with_context(|| format!("failed to create {}", stdout_path.display()))?; let stdout = cap_std::fs::File::from_std(stdout); let stdout = wasi_cap_std_sync::file::File::from_cap_std(stdout); - cx = cx.stdout(Box::new(stdout)); + cx.stdout(Box::new(stdout)); let stderr = std::fs::File::create(&stderr_path) .with_context(|| format!("failed to create {}", stderr_path.display()))?; let stderr = cap_std::fs::File::from_std(stderr); let stderr = wasi_cap_std_sync::file::File::from_cap_std(stderr); - cx = cx.stderr(Box::new(stderr)); + cx.stderr(Box::new(stderr)); if let Some(stdin_path) = &stdin_path { let stdin = std::fs::File::open(stdin_path) .with_context(|| format!("failed to open {}", stdin_path.display()))?; let stdin = cap_std::fs::File::from_std(stdin); let stdin = 
wasi_cap_std_sync::file::File::from_cap_std(stdin); - cx = cx.stdin(Box::new(stdin)); + cx.stdin(Box::new(stdin)); } // Allow access to the working directory so that the benchmark can read // its input workload(s). - cx = cx.preopened_dir(working_dir.try_clone()?, ".")?; + cx.preopened_dir(working_dir.try_clone()?, ".")?; // Pass this env var along so that the benchmark program can use smaller // input workload(s) if it has them and that has been requested. if let Ok(val) = env::var("WASM_BENCH_USE_SMALL_WORKLOAD") { - cx = cx.env("WASM_BENCH_USE_SMALL_WORKLOAD", &val)?; + cx.env("WASM_BENCH_USE_SMALL_WORKLOAD", &val)?; } Ok(cx.build()) @@ -467,7 +467,7 @@ impl BenchState { #[cfg(feature = "wasi-nn")] if wasi_modules.wasi_nn { - wasmtime_wasi_nn::add_to_linker(&mut linker, |cx| &mut cx.wasi_nn)?; + wasmtime_wasi_nn::witx::add_to_linker(&mut linker, |cx| &mut cx.wasi_nn)?; } Ok(Self { @@ -509,7 +509,10 @@ impl BenchState { let host = HostState { wasi: (self.make_wasi_cx)().context("failed to create a WASI context")?, #[cfg(feature = "wasi-nn")] - wasi_nn: wasmtime_wasi_nn::WasiNnCtx::new()?, + wasi_nn: { + let (backends, registry) = wasmtime_wasi_nn::preload(&[])?; + wasmtime_wasi_nn::WasiNnCtx::new(backends, registry) + }, }; // NB: Start measuring instantiation time *after* we've created the WASI From 186c3ec8cfc8a6a69f6e23e638282000d04f5d88 Mon Sep 17 00:00:00 2001 From: Joel Dice Date: Mon, 11 Sep 2023 11:35:16 -0600 Subject: [PATCH 11/14] [wasmtime-wasi] fix logic error in `monotonic-clock/subscribe` (#6993) * [wasmtime-wasi] fix logic error in `monotonic-clock/subscribe` When calculating the number of nanoseconds to wait, we should subtract the current time from the deadline, not vice-versa. This was causing guests to sleep indefinitely due to integer underflow. Signed-off-by: Joel Dice * add `sleep` test to `wasi-tests` Note that this is annotated `should_panic` when testing preview1 scenarios, since those won't have preview2 imports. 
Signed-off-by: Joel Dice --------- Signed-off-by: Joel Dice --- Cargo.lock | 1 + crates/test-programs/tests/wasi-cap-std-sync.rs | 5 +++++ .../tests/wasi-preview1-host-in-preview2.rs | 5 +++++ .../tests/wasi-preview2-components-sync.rs | 4 ++++ .../tests/wasi-preview2-components.rs | 4 ++++ crates/test-programs/tests/wasi-tokio.rs | 5 +++++ crates/test-programs/wasi-tests/Cargo.toml | 3 +++ crates/test-programs/wasi-tests/src/bin/sleep.rs | 16 ++++++++++++++++ crates/wasi/src/preview2/host/clocks.rs | 2 +- 9 files changed, 44 insertions(+), 1 deletion(-) create mode 100644 crates/test-programs/wasi-tests/src/bin/sleep.rs diff --git a/Cargo.lock b/Cargo.lock index 4fc8a705b905..5fe0859d37dd 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -3077,6 +3077,7 @@ dependencies = [ "libc", "once_cell", "wasi", + "wit-bindgen", ] [[package]] diff --git a/crates/test-programs/tests/wasi-cap-std-sync.rs b/crates/test-programs/tests/wasi-cap-std-sync.rs index 6b7143f3b540..860da98024bb 100644 --- a/crates/test-programs/tests/wasi-cap-std-sync.rs +++ b/crates/test-programs/tests/wasi-cap-std-sync.rs @@ -250,6 +250,11 @@ fn sched_yield() { run("sched_yield", true).unwrap() } #[test_log::test] +#[should_panic] +fn sleep() { + run("sleep", true).unwrap() +} +#[test_log::test] fn stdio() { run("stdio", true).unwrap() } diff --git a/crates/test-programs/tests/wasi-preview1-host-in-preview2.rs b/crates/test-programs/tests/wasi-preview1-host-in-preview2.rs index 05dbd32717d9..639504fbe121 100644 --- a/crates/test-programs/tests/wasi-preview1-host-in-preview2.rs +++ b/crates/test-programs/tests/wasi-preview1-host-in-preview2.rs @@ -294,6 +294,11 @@ async fn sched_yield() { run("sched_yield", false).await.unwrap() } #[test_log::test(tokio::test(flavor = "multi_thread"))] +#[should_panic] +async fn sleep() { + run("sleep", false).await.unwrap() +} +#[test_log::test(tokio::test(flavor = "multi_thread"))] async fn stdio() { run("stdio", false).await.unwrap() } diff --git 
a/crates/test-programs/tests/wasi-preview2-components-sync.rs b/crates/test-programs/tests/wasi-preview2-components-sync.rs index 08b60217be62..df0c995c1578 100644 --- a/crates/test-programs/tests/wasi-preview2-components-sync.rs +++ b/crates/test-programs/tests/wasi-preview2-components-sync.rs @@ -272,6 +272,10 @@ fn sched_yield() { run("sched_yield", false).unwrap() } #[test_log::test] +fn sleep() { + run("sleep", false).unwrap() +} +#[test_log::test] fn stdio() { run("stdio", false).unwrap() } diff --git a/crates/test-programs/tests/wasi-preview2-components.rs b/crates/test-programs/tests/wasi-preview2-components.rs index 77a3e71ff6d8..ffcda698eeff 100644 --- a/crates/test-programs/tests/wasi-preview2-components.rs +++ b/crates/test-programs/tests/wasi-preview2-components.rs @@ -280,6 +280,10 @@ async fn sched_yield() { run("sched_yield", false).await.unwrap() } #[test_log::test(tokio::test(flavor = "multi_thread"))] +async fn sleep() { + run("sleep", false).await.unwrap() +} +#[test_log::test(tokio::test(flavor = "multi_thread"))] async fn stdio() { run("stdio", false).await.unwrap() } diff --git a/crates/test-programs/tests/wasi-tokio.rs b/crates/test-programs/tests/wasi-tokio.rs index 9f8390c917bd..ae7bc15c0388 100644 --- a/crates/test-programs/tests/wasi-tokio.rs +++ b/crates/test-programs/tests/wasi-tokio.rs @@ -256,6 +256,11 @@ async fn sched_yield() { run("sched_yield", true).await.unwrap() } #[test_log::test(tokio::test(flavor = "multi_thread"))] +#[should_panic] +async fn sleep() { + run("sleep", true).await.unwrap() +} +#[test_log::test(tokio::test(flavor = "multi_thread"))] async fn stdio() { run("stdio", true).await.unwrap() } diff --git a/crates/test-programs/wasi-tests/Cargo.toml b/crates/test-programs/wasi-tests/Cargo.toml index 1cf5c4949bcd..bf72d58ea313 100644 --- a/crates/test-programs/wasi-tests/Cargo.toml +++ b/crates/test-programs/wasi-tests/Cargo.toml @@ -9,3 +9,6 @@ publish = false libc = "0.2.65" wasi = "0.11.0" once_cell = "1.12" 
+wit-bindgen = { workspace = true, default-features = false, features = [ + "macros", +] } diff --git a/crates/test-programs/wasi-tests/src/bin/sleep.rs b/crates/test-programs/wasi-tests/src/bin/sleep.rs new file mode 100644 index 000000000000..a80f11bbff6a --- /dev/null +++ b/crates/test-programs/wasi-tests/src/bin/sleep.rs @@ -0,0 +1,16 @@ +use crate::wasi::{clocks::monotonic_clock, poll::poll}; + +wit_bindgen::generate!({ + path: "../../wasi/wit", + world: "wasmtime:wasi/command-extended", +}); + +fn main() { + // Sleep ten milliseconds. Note that we call the relevant host functions directly rather than go through + // libstd, since we want to ensure we're calling `monotonic_clock::subscribe` with an `absolute` parameter of + // `true`, which libstd won't necessarily do (but which e.g. CPython _will_ do). + poll::poll_oneoff(&[monotonic_clock::subscribe( + monotonic_clock::now() + 10_000_000, + true, + )]); +} diff --git a/crates/wasi/src/preview2/host/clocks.rs b/crates/wasi/src/preview2/host/clocks.rs index c8c1afc4df87..2c461bd804c8 100644 --- a/crates/wasi/src/preview2/host/clocks.rs +++ b/crates/wasi/src/preview2/host/clocks.rs @@ -64,7 +64,7 @@ impl monotonic_clock::Host for T { })))?) } else { let duration = if absolute { - Duration::from_nanos(clock_now - when) + Duration::from_nanos(when - clock_now) } else { Duration::from_nanos(when) }; From 8995750aa4f4e03eef4d113f746e51d1dd034a9f Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 11 Sep 2023 15:05:35 -0700 Subject: [PATCH 12/14] Redesign Wasmtime's CLI (#6925) * Redesign Wasmtime's CLI This commit follows through on discussion from #6741 to redesign the flags that the `wasmtime` binary accepts on the CLI. Almost all flags have been renamed/moved and will require callers to update. The main motivation here is to cut down on the forest of options in `wasmtime -h` which are difficult to mentally group together and understand. 
The main change implemented here is to move options behind "option groups" which are intended to be abbreviated with a single letter: * `-O foo` - an optimization or performance-tuning related option * `-C foo` - a codegen option affecting the compilation process. * `-D foo` - a debug-related option * `-W foo` - a wasm-related option, for example changing wasm semantics * `-S foo` - a WASI-related option, configuring various proposals for example Each option group can be explored by passing `help`, for example `-O help`. This will print all options within the group along with their help message. Additionally `-O help-long` can be passed to print the full comment for each option if desired. Option groups can be specified multiple times on the command line, for example `-Wrelaxed-simd -Wthreads`. They can also be combined together with commas as `-Wrelaxed-simd,threads`. Configuration works as a "last option wins" so `-Ccache,cache=n` would end up with a compilation cache disabled. Boolean options can be specified as `-C foo` to enable `foo`, or they can be specified with `-Cfoo=$val` with any of `y`, `n`, `yes`, `no`, `true`, or `false`. All other options require a `=foo` value to be passed and the parsing depends on the type. This commit additionally applies a few small refactorings to the CLI as well. For example the help text no longer prints information about wasm features after printing the option help. This is still available via `-Whelp` as all wasm features have moved from `--wasm-features` to `-W`. Additionally flags are no longer conditionally compiled in, but instead all flags are always supported. A runtime error is returned if support for a flag is not compiled in. Additionally the "experimental" name of WASI proposals has been dropped in favor of just the name of the proposal, for example `--wasi nn` instead of `--wasi-modules experimental-wasi-nn`. 
This is intended to mirror how wasm proposals don't have "experimental" in the name and an opt-in is required regardless.

A full listing of flags and how they have changed is:

| old cli flag | new cli flag |
|-----------------------------------------------|-------------------------------------------------|
| `-O, --optimize` | removed |
| `--opt-level ` | `-O opt-level=N` |
| `--dynamic-memory-guard-size ` | `-O dynamic-memory-guard-size=...` |
| `--static-memory-forced` | `-O static-memory-forced` |
| `--static-memory-guard-size ` | `-O static-memory-guard-size=N` |
| `--static-memory-maximum-size ` | `-O static-memory-maximum-size=N` |
| `--dynamic-memory-reserved-for-growth ` | `-O dynamic-memory-reserved-for-growth=...` |
| `--pooling-allocator` | `-O pooling-allocator` |
| `--disable-memory-init-cow` | `-O memory-init-cow=no` |
| `--compiler ` | `-C compiler=..` |
| `--enable-cranelift-debug-verifier` | `-C cranelift-debug-verifier` |
| `--cranelift-enable ` | `-C cranelift-NAME` |
| `--cranelift-set ` | `-C cranelift-NAME=VALUE` |
| `--config ` | `-C cache-config=..` |
| `--disable-cache` | `-C cache=no` |
| `--disable-parallel-compilation` | `-C parallel-compilation=no` |
| `-g` | `-D debug-info` |
| `--disable-address-map` | `-D address-map=no` |
| `--disable-logging` | `-D logging=no` |
| `--log-to-files` | `-D log-to-files` |
| `--coredump-on-trap ` | `-D coredump=..` |
| `--wasm-features all` | `-W all-proposals` |
| `--wasm-features -all` | `-W all-proposals=n` |
| `--wasm-features bulk-memory` | `-W bulk-memory` |
| `--wasm-features multi-memory` | `-W multi-memory` |
| `--wasm-features multi-value` | `-W multi-value` |
| `--wasm-features reference-types` | `-W reference-types` |
| `--wasm-features simd` | `-W simd` |
| `--wasm-features tail-call` | `-W tail-call` |
| `--wasm-features threads` | `-W threads` |
| `--wasm-features memory64` | `-W memory64` |
| `--wasm-features component-model` | `-W component-model` |
| `--wasm-features function-references` | `-W function-references` |
| `--relaxed-simd-deterministic` | `-W relaxed-simd-deterministic` |
| `--enable-cranelift-nan-canonicalization` | `-W nan-canonicalization` |
| `--fuel ` | `-W fuel=N` |
| `--epoch-interruption` | `-W epoch-interruption` |
| `--allow-unknown-exports` | `-W unknown-exports-allow` |
| `--trap-unknown-imports` | `-W unknown-imports-trap` |
| `--default-values-unknown-imports` | `-W unknown-imports-default` |
| `--max-instances ` | `-W max-instances=N` |
| `--max-memories ` | `-W max-memories=N` |
| `--max-memory-size ` | `-W max-memory-size=N` |
| `--max-table-elements ` | `-W max-table-elements=N` |
| `--max-tables ` | `-W max-tables=N` |
| `--max-wasm-stack ` | `-W max-wasm-stack=N` |
| `--trap-on-grow-failure` | `-W trap-on-grow-failure` |
| `--wasm-timeout