Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use simpler ops for binary instructions (fixes GCC issue) #513

Merged
merged 1 commit into from
Sep 2, 2020

Conversation

chfast
Copy link
Collaborator

@chfast chfast commented Aug 27, 2020

This replaces usage of std::plus{} and others with simpler in-house
implementation. The difference is that the std:: versions are passing
arguments by reference. In GCC builds without optimization this may
cause the arguments to be swapped in commutative instructions. This is
at least annoyance for floating-point add which may produce different
NaN results if both inputs are NaNs. This is not a problem for Wasm
conformance, but prevents running exhaustive tests against software
floats implementations.

@chfast chfast requested review from axic and gumb0 August 27, 2020 07:50
@codecov
Copy link

codecov bot commented Aug 27, 2020

Codecov Report

Merging #513 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #513   +/-   ##
=======================================
  Coverage   99.69%   99.69%           
=======================================
  Files          54       54           
  Lines       17183    17198   +15     
=======================================
+ Hits        17131    17146   +15     
  Misses         52       52           

template <typename T>
inline constexpr T add(T a, T b) noexcept
{
return a + b;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer we add the asserts for is_integral||is_float to avoid any random overloading.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is big overkill and still is not complete (as bool, char, long double would be a match). I can't imagine accidentally adding two things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realistically what could happen is that Value is overloaded and this matches for Value+Value. Why would it be an overkill?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is perfectly valid for any type that is copyable and has + operator. The Value does not support + operator as of #464. Also all usages of it explicitly specify the type.

}

template <typename T>
inline constexpr T rem(T a, T b) noexcept
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not call it mod?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because i32.rem_s

@gumb0
Copy link
Collaborator

gumb0 commented Sep 1, 2020

In GCC builds without optimization this may
cause the arguments to be swapped in commutative instructions.

So it was a problem only for plus and multiplies?

Do we replace std::minus, std::divides and std::modulus only for consistency? (if yes, there are other std:: helpers are still used)

@chfast
Copy link
Collaborator Author

chfast commented Sep 1, 2020

So it was a problem only for plus and multiplies?

Do we replace std::minus, std::divides and std::modulus only for consistency? (if yes, there are other std:: helpers are still used)

The problem seems to be only for add. But I replaced also other functions in the same group (arithmetic binops).
My preference is to replace also other functions to limit dependency on standard library and do not use references as well. But maybe not in the "fixes GCC issue" PR.

@gumb0
Copy link
Collaborator

gumb0 commented Sep 2, 2020

I'm getting significant boost with it (with clang)

fizzy/execute/blake2b/512_bytes_rounds_1_mean                     -0.1525         -0.1528           204           173           204           173
fizzy/execute/blake2b/512_bytes_rounds_16_mean                    -0.1525         -0.1520          3064          2596          3062          2596
fizzy/execute/ecpairing/onepoint_mean                             -0.1359         -0.1357       1037524        896474       1037174        896418
fizzy/execute/keccak256/512_bytes_rounds_1_mean                   -0.0736         -0.0735           236           218           236           218
fizzy/execute/keccak256/512_bytes_rounds_16_mean                  -0.1002         -0.1001          3493          3143          3492          3143
fizzy/execute/memset/256_bytes_mean                               -0.1737         -0.1737            17            14            17            14
fizzy/execute/memset/60000_bytes_mean                             -0.1837         -0.1837          3735          3049          3734          3048
fizzy/execute/mul256_opt0/input0_mean                             -0.1336         -0.1336            66            57            66            57
fizzy/execute/mul256_opt0/input1_mean                             -0.1346         -0.1348            66            57            66            57
fizzy/execute/ramanujan_pi/33_runs_mean                           -0.1746         -0.1748           275           227           275           227
fizzy/execute/sha1/512_bytes_rounds_1_mean                        -0.1765         -0.1767           219           180           219           180
fizzy/execute/sha1/512_bytes_rounds_16_mean                       -0.1815         -0.1816          3046          2493          3046          2493
fizzy/execute/sha256/512_bytes_rounds_1_mean                      -0.1490         -0.1491           198           169           198           169
fizzy/execute/sha256/512_bytes_rounds_16_mean                     -0.1546         -0.1547          2720          2299          2720          2299
fizzy/execute/taylor_pi/pi_1000000_runs_mean                      -0.0130         -0.0131         92831         91622         92832         91617
fizzy/execute/micro/eli_interpreter/halt_mean                     -0.0946         -0.0947             0             0             0             0
fizzy/execute/micro/eli_interpreter/exec105_mean                  -0.1084         -0.1084            11            10            11            10
fizzy/execute/micro/factorial/10_mean                             -0.0662         -0.0662             1             1             1             1
fizzy/execute/micro/factorial/20_mean                             -0.0488         -0.0488             2             2             2             2
fizzy/execute/micro/fibonacci/24_mean                             -0.0667         -0.0667         16591         15485         16590         15484
fizzy/execute/micro/host_adler32/1_mean                           -0.0574         -0.0574             0             0             0             0
fizzy/execute/micro/host_adler32/100_mean                         -0.0176         -0.0176             7             7             7             7
fizzy/execute/micro/host_adler32/1000_mean                        -0.0207         -0.0207            72            71            72            71
fizzy/execute/micro/spinner/1_mean                                -0.0392         -0.0392             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                             -0.1289         -0.1289            22            19            22            19

@chfast
Copy link
Collaborator Author

chfast commented Sep 2, 2020

Oh, nice. I will check my machine soon.

This replaces usage of std::plus{} and others with simpler in-house
implementation. The difference is that the std:: versions are passing
arguments by reference. In GCC builds without optimization this may
cause the arguments to be swapped in commutative instructions. This is
at least annoyance for floating-point add which may produce different
NaN results if both inputs are NaNs. This is not a problem for Wasm
conformance, but prevents running exhaustive tests against software
floats implementations.
@chfast
Copy link
Collaborator Author

chfast commented Sep 2, 2020

I checked GCC10 and Clang10 and both generate unchanged assembly.

@chfast chfast merged commit eacb0ed into master Sep 2, 2020
@chfast chfast deleted the simple_ops branch September 2, 2020 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants