Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In LLVM backend, track which floats are guaranteed to be arithmetic, which makes the canonicalization a no-op. #934

Merged
merged 8 commits into from
Nov 26, 2019

Conversation

nlewycky
Copy link
Contributor

@nlewycky nlewycky commented Nov 7, 2019

Description

This is a reimplementation of the patch in PR #651.

Extend state.rs ExtraInfo to track more information about floats. In addition to tracking whether the value has a pending canonicalization of NaNs, also track whether the value is known to be arithmetic (which includes infinities, regular values, and non-signalling NaNs (aka. "arithmetic NaNs" in the webassembly spec)). When the value is arithmetic, the correct sequence of operations to canonicalize the value is a no-op. Therefore, we create a lattice where pending+arithmetic=arithmetic.

Also, this extends the tracking to track all values, including non-SIMD integers. That's why there are more places where pending canonicalizations are applied.

Looking at c-wasm-simd128-example, this provides no performance change to the non-SIMD case (takes 58s on my noisy dev machine). The SIMD case drops from 46s to 29s.

Review

  • Add a short description of the the change to the CHANGELOG.md file

@nlewycky nlewycky requested a review from losfair as a code owner November 7, 2019 18:53
pub struct ExtraInfo {
state: u8,
}
impl ExtraInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to distinguish between f32 and f64 NaNs in ExtraInfo, since the opcodes and values are already typed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the SIMD opcodes which aren't typed. They're all just v128, so we might go from a F64x2Add to a F32x4Mul with no cast in between.

v128_to_f32x4 checks for a pending f64 canonicalization and applies it, while not applying a pending f32 canonicalization. v128_to_f64x2 does the opposite.

@nlewycky nlewycky force-pushed the feature/llvm-nan-fix-2 branch from 446b48c to 6b2f17b Compare November 19, 2019 21:25
Copy link
Contributor

@MarkMcCaskey MarkMcCaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Looks like there's a few small mistakes but otherwise good!

CHANGELOG.md Outdated
@@ -24,6 +24,7 @@ Special thanks to [@newpavlov](https://github.com/newpavlov) and [@Maxgy](https:
- [#939](https://github.com/wasmerio/wasmer/pull/939) Fix bug causing attempts to append to files with WASI to delete the contents of the file
- [#940](https://github.com/wasmerio/wasmer/pull/940) Update supported Rust version to 1.38+
- [#923](https://github.com/wasmerio/wasmer/pull/923) Fix memory leak in the C API caused by an incorrect cast in `wasmer_trampoline_buffer_destroy`
- [#934](https://github.com/wasmerio/wasmer/pull/934) Simplify float expressions in the LLVM backend.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog entry should be moved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// machine, but which might not be in the LLVM value. The conversion to
// arithmetic NaN is pending. It is required for correctness.
PendingF32NaN,
pub fn pending_f32_nan() -> ExtraInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions can be const

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ExtraInfo { state: 8 }
}

pub fn has_pending_f32_nan(&self) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions may be able to be const

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are. Done.

ExtraInfo::None
ExtraInfo { state: 0 }
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as derive(Default) on ExtraInfo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it is! Done.


fn bitor(self, other: Self) -> Self {
assert!(!(self.has_pending_f32_nan() && other.has_pending_f64_nan()));
assert!(!(self.has_pending_f64_nan() && other.has_pending_f32_nan()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you only want these in debug mode, use debug_assert!, these asserts will run on every use

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, done.

lib/llvm-backend/src/code.rs Show resolved Hide resolved
It seemed like a good idea at the time, but in practice we discard the extra info all or almost all of the time.

This also introduces a new bug. In an operation like multiply, it's valid to multiply two values, one with a pending NaN and one without. As written, in the SIMD case (because of the two kinds of pending in play), we assert.
Unfortunately, this is quite buggy. For something as simple as F32Sub, to combine two ExtraInfos, we want to add a new pending_f32_nan(), unless both of the inputs are arithmetic_f32(). In this commit, we incorrectly calculate that we don't need a pending_f32_nan if either one of the inputs was arithmetic_f32().
We want to ignore the incoming pending NaN state (since the pending will propagate to the output if there was one on the input), and we want to add a new pending NaN state if we can (that is to say, if it isn't cancelled out by both inputs having arithmetic state). Do this by discarding the pending states on the inputs, intersecting them (to keep only the arithmetic state), then union in a pending nan state (which might do nothing, if it's arithmetic).

If the above sounds confusing, keep in mind that when a value is arithmetic, the act of performing the "NaN canonicalization" is a no-op. Thus, being arithmetic cancels out pending NaN states.
Fix a bug in Operator::Select and add a comment to explain the intention.

Use derived default for ExtraInfo.

Make ExtraInfo associated functions const.

Turn two asserts into debug_asserts.
@nlewycky nlewycky force-pushed the feature/llvm-nan-fix-2 branch from 6b2f17b to ff73c5d Compare November 26, 2019 20:29
@nlewycky
Copy link
Contributor Author

bors r+

bors bot added a commit that referenced this pull request Nov 26, 2019
934: In LLVM backend, track which floats are guaranteed to be arithmetic, which makes the canonicalization a no-op. r=nlewycky a=nlewycky

# Description
This is a reimplementation of the patch in PR #651.

Extend state.rs ExtraInfo to track more information about floats. In addition to tracking whether the value has a pending canonicalization of NaNs, also track whether the value is known to be arithmetic (which includes infinities, regular values, and non-signalling NaNs (aka. "arithmetic NaNs" in the webassembly spec)). When the value is arithmetic, the correct sequence of operations to canonicalize the value is a no-op. Therefore, we create a lattice where pending+arithmetic=arithmetic.

Also, this extends the tracking to track all values, including non-SIMD integers. That's why there are more places where pending canonicalizations are applied.

Looking at c-wasm-simd128-example, this provides no performance change to the non-SIMD case (takes 58s on my noisy dev machine). The SIMD case drops from 46s to 29s.

# Review

- [ ] Add a short description of the the change to the CHANGELOG.md file


Co-authored-by: Nick Lewycky <[email protected]>
@bors
Copy link
Contributor

bors bot commented Nov 26, 2019

Build succeeded

@bors bors bot merged commit ff73c5d into master Nov 26, 2019
@bors bors bot deleted the feature/llvm-nan-fix-2 branch November 26, 2019 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants