<charconv>: simplify _Assemble_floating_point_value and optimize _Right_shift_with_rounding #1220

statementreply · 2020-08-21T16:36:41Z

Modify the behavior of _Assemble_floating_point_value_no_shift (renamed to avoid potential ODR issue) to gracefully handle the special cases below, and remove special case handling code from _Assemble_floating_point_value.
- When the significand carries over to a higher bit after rounding up, we need to renormalize the significand and increase the exponent to keep the significand within the normalized range.
  (example: 0x1.fffffffffffff8p+0 rounds to 0x1.0000000000000p+1)
- In some cases, the new exponent becomes greater than the maximum exponent of the floating point format, so the result overflows.
  (example: 0x1.fffffffffffff8p+1023 overflows to ∞)
- In some cases, the mantissa of a subnormal value becomes normalized after rounding up, so the result becomes a normal value.
  (example: 0x0.fffffffffffff8p-1022 rounds to 0x1.0000000000000p-1022)
Optimize _Right_shift_with_rounding with the branchless rounding technique in hex to_chars.

This commit modifies the behavior of _Assemble_floating_point_value_t (and renames it to _Assemble_floating_point_value_no_shift to avoid potential ODR issues) to gracefully handle the cases below, and removes special case handling code from _Assemble_floating_point_value.
- When the significand carries over to a higher bit after rounding up, we need to renormalize the significand and increase the exponent to keep the significand within the normalized range.
  (example: 0x1.fffffffffffff8p+0 rounds to 0x1.0000000000000p+1)
- In some cases, the new exponent becomes greater than the maximum exponent of the floating-point format, so the result overflows.
  (example: 0x1.fffffffffffff8p+1023 overflows to inf)
- In some cases, the mantissa of a subnormal value becomes normalized after rounding up, so the result becomes a normal value.
  (example: 0x0.fffffffffffff8p-1022 rounds to 0x1.0000000000000p-1022)
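One way to see why the three cases above need no dedicated branches is to add the mantissa to the exponent field instead of OR-ing it in, so that a carry out of the mantissa flows into the exponent. The following is a minimal sketch for `double` only, under that assumption. The names `assemble_double_bits` and `to_double` are hypothetical and are not the identifiers used in `<charconv>`; the actual implementation may differ in details such as how overflow is reported.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration; not the STL's internal function.
// exponent: unbiased exponent of the value (pass -1022 for subnormal inputs).
// mantissa: already rounded, in [0, 2^53]; includes the implicit leading bit
//           for normal inputs and omits it for subnormal inputs.
constexpr std::uint64_t assemble_double_bits(bool negative, int exponent, std::uint64_t mantissa) {
    constexpr int mantissa_bits = 52;
    constexpr int exponent_bias = 1023;
    assert(exponent >= -1022 && exponent <= 1023);
    assert(mantissa <= (std::uint64_t{1} << 53));

    const std::uint64_t sign = static_cast<std::uint64_t>(negative) << 63;
    // Store (biased exponent - 1). The implicit bit of a normal mantissa adds the
    // missing 1, and a mantissa that rounded up to 2^53 adds 2, which bumps the
    // exponent and leaves a zero fraction.
    const std::uint64_t exponent_component =
        static_cast<std::uint64_t>(exponent + exponent_bias - 1) << mantissa_bits;
    return sign | (exponent_component + mantissa);
}

// Reinterprets the bit pattern as a double (memcpy keeps this valid in C++17).
inline double to_double(std::uint64_t bits) {
    double result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}

constexpr std::uint64_t implicit_bit = std::uint64_t{1} << 52;

// Plain normal value: 0x1.0p+0
static_assert(assemble_double_bits(false, 0, implicit_bit) == 0x3FF0000000000000u, "1.0");
// Carry after rounding up: the result is 0x1.0p+1
static_assert(assemble_double_bits(false, 0, implicit_bit << 1) == 0x4000000000000000u, "2.0");
// Carry at the maximum exponent: the result is the bit pattern of infinity
static_assert(assemble_double_bits(false, 1023, implicit_bit << 1) == 0x7FF0000000000000u, "inf");
// Subnormal mantissa that rounded up to 2^52: the result is the smallest normal value
static_assert(assemble_double_bits(false, -1022, implicit_bit) == 0x0010000000000000u, "min normal");
```

In this sketch a caller could detect overflow by checking whether the exponent field of the result is all ones; whether the real code does so at this point is not stated in the description.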

Use the branchless rounding technique in hex to_chars with minor modifications to handle input tail bits.
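As a rough illustration of what branchless round-to-nearest, ties-to-even can look like when the input may already have lost some tail bits, here is a minimal sketch. The function names and the `has_zero_tail` parameter are assumptions made for this example, and the sketch only covers shifts from 1 to 63; it is not a copy of `_Right_shift_with_rounding`. A branching reference with the same contract is included for comparison.

```cpp
#include <cstdint>

// Hypothetical names; a sketch of the technique, not the STL's code.
// Precondition: 0 < shift < 64.
// has_zero_tail: true if every bit already discarded below `value` was zero.
constexpr std::uint64_t shift_right_rounded(std::uint64_t value, unsigned shift, bool has_zero_tail) {
    // Align the three inputs of the rounding decision at bit position `shift`.
    const std::uint64_t lsb_bit = value;        // lowest bit that is kept
    const std::uint64_t round_bit = value << 1; // highest bit that is dropped
    // Subtracting 1 borrows through the round bit only when all lower bits are
    // zero, so the bit at `shift` survives only if the round bit is set and some
    // tail bit exists. If the input already has a nonzero tail, nothing is
    // subtracted and the round bit alone decides.
    const std::uint64_t has_tail_bits = round_bit - static_cast<std::uint64_t>(has_zero_tail);
    const std::uint64_t should_round = ((round_bit & (has_tail_bits | lsb_bit)) >> shift) & 1u;
    return (value >> shift) + should_round;
}

// Branching reference with the same contract.
constexpr std::uint64_t shift_right_rounded_reference(std::uint64_t value, unsigned shift, bool has_zero_tail) {
    const std::uint64_t kept = value >> shift;
    const std::uint64_t round_mask = std::uint64_t{1} << (shift - 1);
    const bool round_bit = (value & round_mask) != 0;
    const bool tail = (value & (round_mask - 1)) != 0 || !has_zero_tail;
    const bool lsb = (kept & 1u) != 0;
    return kept + ((round_bit && (tail || lsb)) ? 1u : 0u);
}

static_assert(shift_right_rounded(0b1011, 2, true) == 3, "2.75 rounds up");
static_assert(shift_right_rounded(0b1001, 2, true) == 2, "2.25 rounds down");
static_assert(shift_right_rounded(0b1010, 2, true) == 2, "2.5 ties to even");
static_assert(shift_right_rounded(0b1110, 2, true) == 4, "3.5 ties to even");
static_assert(shift_right_rounded(0b1010, 2, false) == 3, "2.5 plus a nonzero tail rounds up");
```

The unsigned subtraction may wrap when `round_bit` is zero, which is well defined and harmless here because the result is masked by `round_bit` afterwards.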

StephanTLavavej

Thanks! I think I understand the overall approach here, and the new assembly process makes more sense than the previous control flow. I'm marking this as Request Changes for the "tail bits" bug that I believe I found (plus testing). There's an additional question about space/time improvements for the rounding technique.

stl/inc/charconv

statementreply · 2020-08-29T11:54:25Z

Here are my benchmark results (Intel Core i5-8400, fixed CPU clock speed at 2.7 GHz, VS 2019 16.8 Preview 2, Clang/LLVM 11.0.0-rc1). The measured times are average nanoseconds per floating-point string.

Times with default dynamic CPU clock speed setting are around 0.7x the values below.

| Scenario | MSVC Baseline | MSVC +Assemble | MSVC +Rounding | MSVC +Both | LLVM Baseline | LLVM +Assemble | LLVM +Rounding | LLVM +Both |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| x64 `float` hex 5 (exact) | 49.0 | 49.4 | 48.7 | 49.2 | 51.9 | 51.7 | 51.5 | 51.4 |
| x64 `float` hex 6 (rounding) | 61.2 | 61.2 | 52.2 | 💚 53.1 | 55.5 | 56.4 | 54.7 | 55.1 |
| x64 `double` hex 13 (exact) | 64.6 | 62.3 | 64.4 | ✔️ 62.4 | 66.8 | 66.9 | 66.7 | 66.9 |
| x64 `double` hex 14 (rounding) | 75.4 | 74.2 | 67.8 | 💚 66.7 | 70.9 | 71.9 | 70.3 | 70.9 |
| x64 `float` plain shortest roundtrip | 159.9 | 162.3 | 151.2 | ✔️ 153.4 | 144.4 | 146.4 | 145.8 | 143.9 |
| x64 `double` plain shortest roundtrip | 247.2 | 244.9 | 237.3 | ✔️ 232.8 | 222.4 | 225.6 | 224.5 | 224.0 |
| x86 `float` hex 5 (exact) | 62.0 | 61.1 | 62.0 | 61.1 | 61.3 | 60.4 | 61.5 | 60.3 |
| x86 `float` hex 6 (rounding) | 86.3 | 85.4 | 71.6 | 💚 71.0 | 70.8 | 71.8 | 68.2 | 68.8 |
| x86 `double` hex 13 (exact) | 80.6 | 83.6 | 80.7 | ❌ 83.6 | 78.2 | 78.7 | 78.9 | 79.8 |
| x86 `double` hex 14 (rounding) | 106.7 | 106.9 | 91.2 | 💚 91.2 | 91.6 | 89.6 | 87.9 | 89.5 |
| x86 `float` plain shortest roundtrip | 205.2 | 204.5 | 186.5 | 💚 185.9 | 175.0 | 175.4 | 169.9 | ✔️ 169.9 |
| x86 `double` plain shortest roundtrip | 311.0 | 316.4 | 293.2 | ✔️ 297.3 | 267.5 | 271.2 | 265.7 | 266.2 |

stl/inc/charconv

StephanTLavavej

Thanks for the detailed perf data - nice improvements for MSVC! I'll push a one-line change to use logical AND with bool. FYI @cbezault as you had previously approved.

StephanTLavavej · 2020-10-03T02:27:08Z

Thanks again for improving <charconv>! 🚀 😸

statementreply added 2 commits August 21, 2020 23:17

Optimize _Right_shift_with_rounding

1f9b328


statementreply requested a review from a team as a code owner August 21, 2020 16:36

Clarify comment on from_chars underflow behavior

0b1a645

StephanTLavavej added the performance Must go faster label Aug 22, 2020

Fix typo and revert deleted comments

6b836ef

StephanTLavavej self-assigned this Aug 26, 2020

mnatsuhara assigned cbezault Aug 26, 2020

StephanTLavavej requested changes Aug 29, 2020


StephanTLavavej removed their assignment Aug 29, 2020

statementreply added 3 commits August 29, 2020 14:41

Fix rounding after right shift by 64 bits

19f7681

s/floating point/floating-point/g

930fc8d

_Shift is uint32_t

9e52373

cbezault approved these changes Sep 2, 2020


mnatsuhara assigned StephanTLavavej and unassigned cbezault Sep 2, 2020

Remove TRANSITION comment

ec7a66d

StephanTLavavej reviewed Oct 2, 2020


StephanTLavavej reviewed Oct 2, 2020


Use logical AND for bools

d9ca063

StephanTLavavej approved these changes Oct 2, 2020


StephanTLavavej assigned StephanTLavavej and unassigned StephanTLavavej Oct 2, 2020

StephanTLavavej merged commit 974582f into microsoft:master Oct 3, 2020

statementreply deleted the simplify_assemble_float branch April 17, 2021 10:57
