Improve codegen for unchecked float casts on wasm #74659

alexcrichton · 2020-07-22T22:06:28Z

This commit improves codegen for unchecked casts on WebAssembly targets
to use the singluar iNN.trunc_fMM_{u,s} instructions. Previously rustc
would codegen a bare fptosi and fptoui for float casts but for
WebAssembly targets the codegen for these instructions is quite large.
This large codegen is due to the fact that LLVM can speculate these
instructions so the trapping behavior of WebAssembly needs to be
protected against in case they're speculated.

The change here is to update the codegen for the unchecked cast
intrinsics to have a wasm-specific case where they call the appropriate
LLVM intrinsic to generate the right wasm instruction. The intrinsic is
explicitly opting-in to undefined behavior so a trap here for
out-of-bounds inputs on wasm should be acceptable.

cc #73591

This commit improves codegen for unchecked casts on WebAssembly targets to use the singluar `iNN.trunc_fMM_{u,s}` instructions. Previously rustc would codegen a bare `fptosi` and `fptoui` for float casts but for WebAssembly targets the codegen for these instructions is quite large. This large codegen is due to the fact that LLVM can speculate these instructions so the trapping behavior of WebAssembly needs to be protected against in case they're speculated. The change here is to update the codegen for the unchecked cast intrinsics to have a wasm-specific case where they call the appropriate LLVM intrinsic to generate the right wasm instruction. The intrinsic is explicitly opting-in to undefined behavior so a trap here for out-of-bounds inputs on wasm should be acceptable. cc rust-lang#73591

rust-highfive · 2020-07-22T22:06:31Z

r? @varkor

(rust_highfive has picked a reviewer for you, use r? to override)

CryZe · 2020-07-22T22:19:14Z

Ideally we would also use this underneath the saturating code we generate. I tried that a few weeks ago but I noticed that the guard code is select based, meaning it would trap too early. Is this something we should still pursue?

alexcrichton · 2020-07-22T22:21:18Z

I personally think it's still worth pursuing, but we'll need to do more invasive changes there by creating entirely new basic blocks instead of using a select instruction. I suspect that's fine, it's just a chunk of work I hadn't gotten to yet and figured that this would still be beneficial to enable on wasm targets.

varkor · 2020-07-22T23:51:54Z

@bors r+ rollup

bors · 2020-07-22T23:51:56Z

📌 Commit 618aeec has been approved by varkor

@ghost

…arth Rollup of 8 pull requests Successful merges: - rust-lang#74141 (libstd/libcore: fix various typos) - rust-lang#74490 (add a Backtrace::disabled function) - rust-lang#74548 (one more Path::with_extension example, to demonstrate behavior) - rust-lang#74587 (Prefer constant over function) - rust-lang#74606 (Remove Linux workarounds for missing CLOEXEC support) - rust-lang#74637 (Make str point to primitive page) - rust-lang#74654 (require type defaults to be after const generic parameters) - rust-lang#74659 (Improve codegen for unchecked float casts on wasm) Failed merges: r? @ghost

@1

This commit improves code generation for WebAssembly targets when translating floating to integer casts. This improvement is only relevant when the `nontrapping-fptoint` feature is not enabled, but the feature is not enabled by default right now. Additionally this improvement only affects safe casts since unchecked casts were improved in rust-lang#74659. Some more background for this issue is present on rust-lang#73591, but the general gist of the issue is that in LLVM the `fptosi` and `fptoui` instructions are defined to return an `undef` value if they execute on out-of-bounds values; they notably do not trap. To implement these instructions for WebAssembly the LLVM backend must therefore generate quite a few instructions before executing `i32.trunc_f32_s` (for example) because this WebAssembly instruction traps on out-of-bounds values. This codegen into wasm instructions happens very late in the code generator, so what ends up happening is that rustc inserts its own codegen to implement Rust's saturating semantics, and then LLVM also inserts its own codegen to make sure that the `fptosi` instruction doesn't trap. Overall this means that a function like this: #[no_mangle] pub unsafe extern "C" fn cast(x: f64) -> u32 { x as u32 } will generate this WebAssembly today: (func $cast (type 0) (param f64) (result i32) (local i32 i32) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.gt local.set 1 block ;; label = @1 block ;; label = @2 local.get 0 f64.const 0x0p+0 (;=0;) local.get 0 f64.const 0x0p+0 (;=0;) f64.gt select local.tee 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@2;) local.get 0 i32.trunc_f64_u local.set 2 br 1 (;@1;) end i32.const 0 local.set 2 end i32.const -1 local.get 2 local.get 1 select) This PR improves the situation by updating the code generation for float-to-int conversions in rustc, specifically only for WebAssembly targets and only for some situations (float-to-u8 still has not great codegen). The fix here is to use basic blocks and control flow to avoid speculatively executing `fptosi`, and instead LLVM's raw intrinsic for the WebAssembly instruction is used instead. This effectively extends the support added in rust-lang#74659 to checked casts. After this commit the codegen for the above Rust function looks like: (func $cast (type 0) (param f64) (result i32) (local i32) block ;; label = @1 local.get 0 f64.const 0x0p+0 (;=0;) f64.ge local.tee 1 i32.const 1 i32.xor br_if 0 (;@1;) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.le i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const -1 i32.const 0 local.get 1 select) For reference, in Rust 1.44, which did not have saturating float-to-integer casts, the codegen LLVM would emit is: (func $cast (type 0) (param f64) (result i32) block ;; label = @1 local.get 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const 0) So we're relatively close to the original codegen, although it's slightly different because the semantics of the function changed where we're emulating the `i32.trunc_sat_f32_s` instruction rather than always replacing out-of-bounds values with zero. There is still work that could be done to improve casts such as `f32` to `u8`. That form of cast still uses the `fptosi` instruction which generates lots of branch-y code. This seems less important to tackle now though. In the meantime this should take care of most use cases of floating-point conversion and as a result I'm going to speculate that this... Closes rust-lang#73591

@1

…es, r=nagisa rustc: Improving safe wasm float->int casts This commit improves code generation for WebAssembly targets when translating floating to integer casts. This improvement is only relevant when the `nontrapping-fptoint` feature is not enabled, but the feature is not enabled by default right now. Additionally this improvement only affects safe casts since unchecked casts were improved in rust-lang#74659. Some more background for this issue is present on rust-lang#73591, but the general gist of the issue is that in LLVM the `fptosi` and `fptoui` instructions are defined to return an `undef` value if they execute on out-of-bounds values; they notably do not trap. To implement these instructions for WebAssembly the LLVM backend must therefore generate quite a few instructions before executing `i32.trunc_f32_s` (for example) because this WebAssembly instruction traps on out-of-bounds values. This codegen into wasm instructions happens very late in the code generator, so what ends up happening is that rustc inserts its own codegen to implement Rust's saturating semantics, and then LLVM also inserts its own codegen to make sure that the `fptosi` instruction doesn't trap. Overall this means that a function like this: #[no_mangle] pub unsafe extern "C" fn cast(x: f64) -> u32 { x as u32 } will generate this WebAssembly today: (func $cast (type 0) (param f64) (result i32) (local i32 i32) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.gt local.set 1 block ;; label = @1 block ;; label = @2 local.get 0 f64.const 0x0p+0 (;=0;) local.get 0 f64.const 0x0p+0 (;=0;) f64.gt select local.tee 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@2;) local.get 0 i32.trunc_f64_u local.set 2 br 1 (;@1;) end i32.const 0 local.set 2 end i32.const -1 local.get 2 local.get 1 select) This PR improves the situation by updating the code generation for float-to-int conversions in rustc, specifically only for WebAssembly targets and only for some situations (float-to-u8 still has not great codegen). The fix here is to use basic blocks and control flow to avoid speculatively executing `fptosi`, and instead LLVM's raw intrinsic for the WebAssembly instruction is used instead. This effectively extends the support added in rust-lang#74659 to checked casts. After this commit the codegen for the above Rust function looks like: (func $cast (type 0) (param f64) (result i32) (local i32) block ;; label = @1 local.get 0 f64.const 0x0p+0 (;=0;) f64.ge local.tee 1 i32.const 1 i32.xor br_if 0 (;@1;) local.get 0 f64.const 0x1.fffffffep+31 (;=4.29497e+09;) f64.le i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const -1 i32.const 0 local.get 1 select) For reference, in Rust 1.44, which did not have saturating float-to-integer casts, the codegen LLVM would emit is: (func $cast (type 0) (param f64) (result i32) block ;; label = @1 local.get 0 f64.const 0x1p+32 (;=4.29497e+09;) f64.lt local.get 0 f64.const 0x0p+0 (;=0;) f64.ge i32.and i32.eqz br_if 0 (;@1;) local.get 0 i32.trunc_f64_u return end i32.const 0) So we're relatively close to the original codegen, although it's slightly different because the semantics of the function changed where we're emulating the `i32.trunc_sat_f32_s` instruction rather than always replacing out-of-bounds values with zero. There is still work that could be done to improve casts such as `f32` to `u8`. That form of cast still uses the `fptosi` instruction which generates lots of branch-y code. This seems less important to tackle now though. In the meantime this should take care of most use cases of floating-point conversion and as a result I'm going to speculate that this... Closes rust-lang#73591

rust-highfive assigned varkor Jul 22, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 22, 2020

alexcrichton mentioned this pull request Jul 22, 2020

Redundant checks in floating point to integer casts on WASM #73591

Closed

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 22, 2020

Manishearth mentioned this pull request Jul 23, 2020

Rollup of 8 pull requests #74667

Merged

bors merged commit 8f02f2c into rust-lang:master Jul 23, 2020

alexcrichton mentioned this pull request Jul 23, 2020

rustc: Improving safe wasm float->int casts #74695

Merged

alexcrichton deleted the wasm-codegen branch July 23, 2020 21:21

cuviper added this to the 1.47.0 milestone May 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve codegen for unchecked float casts on wasm #74659

Improve codegen for unchecked float casts on wasm #74659

alexcrichton commented Jul 22, 2020

rust-highfive commented Jul 22, 2020

CryZe commented Jul 22, 2020

alexcrichton commented Jul 22, 2020

varkor commented Jul 22, 2020

bors commented Jul 22, 2020

Improve codegen for unchecked float casts on wasm #74659

Improve codegen for unchecked float casts on wasm #74659

Conversation

alexcrichton commented Jul 22, 2020

rust-highfive commented Jul 22, 2020

CryZe commented Jul 22, 2020

alexcrichton commented Jul 22, 2020

varkor commented Jul 22, 2020

bors commented Jul 22, 2020