Recalculate lerp if we got infinity. Eliminates some overflows. #1918

fsb4000 · 2021-05-11T05:14:05Z

Fixes #1917

stl/inc/cmath

fsb4000 · 2021-05-12T08:32:28Z

> It breaks monotonicity.

Thank you for finding some issues. I fixed some but I have no idea how to restore monotonicity for lerp :(

statementreply · 2021-05-12T12:54:10Z

> I fixed some but I have no idea how to restore monotonicity for lerp :(

Never mind. It can be tricky to implement floating-point functionality when there are requirements on precision, handling of extreme values, and/or preservation of mathematical properties.

~~I'm going to open a PR (maybe on the weekend) with an alternative approach to fix the false overflow.~~ We need to investigate whether there's a simple-ish fix for the original issue.

stl/inc/cmath

abolz · 2021-05-12T15:58:42Z

FWIW, this could be fixed with a fused multiply-add

-const auto _Candidate = _ArgA + _ArgT * (_ArgB - _ArgA);
+const auto _Candidate = std::fma(_ArgT, _ArgB - _ArgA, _ArgA);

But fma is not constexpr.
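To make the difference concrete, here is a minimal sketch (not the STL's code) that compares the two forms on inputs where the true result is representable but the intermediate product is not. The names `lerp_naive` and `lerp_fma` and the sample values are illustrative assumptions; the `volatile` temporary is only there to stop the compiler from fusing the multiply and add on its own.

```cpp
#include <cassert>
#include <cmath>

// Illustrative names, not part of <cmath>.
// Naive extrapolation: the product t * (b - a) is rounded (and may overflow)
// before a is added. The volatile temporary keeps the compiler from
// contracting the expression into a fused multiply-add by itself.
inline double lerp_naive(double a, double b, double t) {
    volatile double product = t * (b - a);
    return a + product;
}

// Fused form: t * (b - a) + a is evaluated with a single rounding at the end,
// so an oversized intermediate product does not turn into infinity.
inline double lerp_fma(double a, double b, double t) {
    return std::fma(t, b - a, a);
}
```

With `a = 1e308`, `b = 5e307`, `t = 4.0`, the product is about -2e308, which is beyond the largest finite `double`, so `lerp_naive` returns negative infinity. `lerp_fma` returns a finite value near -1e308, assuming `std::fma` is correctly rounded on the platform.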

stl/inc/cmath

fsb4000 · 2021-05-13T11:17:23Z

@statementreply

Well, at least we can reuse new test code :(

What do you think about fma? It seems it doesn't overflow but we can get different results for runtime/compiletime invocation of lerp...

https://gcc.godbolt.org/z/T564z58qo

stl/inc/cmath

MattStephanson · 2021-05-14T06:33:14Z

The condition we have to handle here is limited to |t*(b-a)| ≤ 2*MEOW_MAX, right? Otherwise we'd have a true, not spurious, overflow. If that's the case, maybe we can simply scale the calculation by a fixed factor.

auto _Smaller     = _ArgT;
auto _Larger      = _ArgB - _ArgA;
auto _Abs_smaller = _Float_abs(_Smaller);
auto _Abs_larger  = _Float_abs(_Larger);
if (_Abs_larger < _Abs_smaller) {
    _STD swap(_Smaller, _Larger);
    _STD swap(_Abs_smaller, _Abs_larger);
}

if (_Abs_larger > _STD sqrt(numeric_limits<_Ty>::max()) && _Abs_smaller > 1) {
    return 2 * (_Ty{0.5} * _ArgA + _Smaller * (_Ty{0.5} * _Larger));
} else {
    return _ArgA + _Smaller * _Larger;
}

_Larger is too large to be subnormal, so scaling by 0.5 is exact, and the product _Smaller * _Larger is large enough that if _ArgA is subnormal, it will be too small to contribute anyway. So I think the two expressions should be equivalent, apart from internal overflow, so there won't be a loss of monotonicity when switching from one to the other.
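For readers who want to try the scaling idea outside the STL, the following is a stand-alone sketch of the same branch for `double`. The function name `lerp_scaled` is hypothetical, `std::fabs` stands in for the internal `_Float_abs`, and the sketch assumes `b - a` is finite; it is not the code that was merged.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>

// Hypothetical stand-alone version of the scaled branch. Assumes b - a is finite.
inline double lerp_scaled(double a, double b, double t) {
    double smaller     = t;
    double larger      = b - a;
    double abs_smaller = std::fabs(smaller);
    double abs_larger  = std::fabs(larger);
    if (abs_larger < abs_smaller) {
        std::swap(smaller, larger);
        std::swap(abs_smaller, abs_larger);
    }

    const double sqrt_max = std::sqrt(std::numeric_limits<double>::max());
    if (abs_larger > sqrt_max && abs_smaller > 1.0) {
        // The product may overflow, so compute half of the result and double
        // it at the end. Halving `larger` is exact because it is far from the
        // subnormal range.
        return 2.0 * (0.5 * a + smaller * (0.5 * larger));
    }
    return a + smaller * larger;
}
```

For example, `lerp_scaled(1e308, 5e307, 4.0)` takes the scaled branch and stays finite, whereas the unscaled product `4.0 * (5e307 - 1e308)` would overflow.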

statementreply · 2021-05-14T06:45:40Z

> What do you think about fma? It seems it doesn't overflow but we can get different results for runtime/compiletime invocation of lerp...

Accuracy

fma(t, b - a, a) is (on average) more accurate than a + t * (b - a), and doesn't suffer from t * (b - a) overflowing as long as the final result doesn't overflow. (1 - t) * a + t * b in the other branch could also be fma(t, b, fma(-t, a, a)).
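As a rough illustration of the second expression, the sketch below spells out the nested form for `double`. The helper name `lerp_two_fma` is an assumption made for this example only.

```cpp
#include <cassert>
#include <cmath>

// Illustrative helper (the name is an assumption): evaluates
// (1 - t) * a + t * b as two fused multiply-adds.
inline double lerp_two_fma(double a, double b, double t) {
    const double scaled_a = std::fma(-t, a, a); // a - t * a == (1 - t) * a
    return std::fma(t, b, scaled_a);            // t * b + (1 - t) * a
}
```

For finite inputs this form returns `a` at `t == 0` and `b` at `t == 1`, because the inner call yields `a` and `0` respectively before the outer call adds `t * b`.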

Performance

Recent CPUs (AMD since ~2012, Intel since ~2014, and all ARM CPUs that Windows arm64 supports) have hardware fma instructions. Older CPUs don't support hardware fma, so std::fma needs to be implemented in software.

On the current version of MSVC, if not compiled with /arch:AVX2 or similar, std::fma isn't inlined and generates a call into UCRT (https://godbolt.org/z/q4vbe7xqn). UCRT then checks whether the CPU supports fma. If so, it calls the CPU instruction directly (however, the function call overhead is non-trivial). If not, UCRT uses another algorithm to compute the result, which is unfortunately neither accurate (DevCom-242309) nor fast (hundreds of nanoseconds for some input values) as of ucrtbase.dll 10.0.19041.789.

Different results between compile time and runtime

There are already situations where results could differ.

<cmath> functions (exp, sin, etc.) use an fma-powered implementation on CPUs with hardware fma, and a different implementation without fma on CPUs without hardware fma. Code using these math functions could generate different results on different CPUs.

GCC performs constant folding (not limited to constexpr) on math functions. The compile time results could be different from the results from the runtime math library. (https://godbolt.org/z/r49WEax6E)

Also, the compiler is allowed to compile x * y + z into fma(x, y, z) (https://godbolt.org/z/zaY6veW9j), which may also lead to different results between compile time and runtime.

However, should the results differ between compile time and runtime, it's desirable that the compile time results are correctly rounded (as if computed with infinite precision and then rounded), or at least more accurate than the runtime results.

statementreply · 2021-05-14T07:26:21Z

> The condition we have to handle here is limited to |t*(b-a)| ≤ 2*MEOW_MAX, right? Otherwise we'd have a true, not spurious, overflow. If that's the case, maybe we can simply scale the calculation by a fixed factor.
>
> auto _Smaller     = _ArgT;
> auto _Larger      = _ArgB - _ArgA;
> auto _Abs_smaller = _Float_abs(_Smaller);
> auto _Abs_larger  = _Float_abs(_Larger);
> if (_Abs_larger < _Abs_smaller) {
>     _STD swap(_Smaller, _Larger);
>     _STD swap(_Abs_smaller, _Abs_larger);
> }
>
> if (_Abs_larger > _STD sqrt(numeric_limits<_Ty>::max()) && _Abs_smaller > 1) {
>     return 2 * (_Ty{0.5} * _ArgA + _Smaller * (_Ty{0.5} * _Larger));
> } else {
>     return _ArgA + _Smaller * _Larger;
> }
>
> _Larger is too large to be subnormal, so scaling by 0.5 is exact, and the product _Smaller * _Larger is large enough that if _ArgA is subnormal, it will be too small to contribute anyway. So I think the two expressions should be equivalent, apart from internal overflow, so there won't be a loss of monotonicity when switching from one to the other.

Note that we have |b - a| <= MEOW_MAX in this branch, so it is limited to |t| > 1. I think we could do the following to save a few branches (untested, needs benchmark):

// precondition: T{0.5} * t is exact
T linear_for_lerp(T intercept, T slope, T t) {
    T half_prod = slope * (T{0.5} * t);
    if (abs(half_prod) <= numeric_limits<T>::max() * T{0.5}) {
        return intercept + slope * t;
    } else {
        return T{2} * (T{0.5} * intercept + half_prod);
    }
}

T lerp(T a, T b, T t) {
    // [...]
    {
        if (abs(t) < T{1}) {
            T candidate = a + t * (b - a);
            // [...] fix monotonicity
        }

        if (t >= T{1}) {
            return linear_for_lerp(b, b - a, t - T{1});
        }

        // t <= -1
        return linear_for_lerp(a, b - a, t);
    }
    // [[...]]
}
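The outline above leaves the interpolation branch elided, so the following compilable sketch covers only the extrapolation part (|t| >= 1 with a finite `b - a`). The wrapper name `lerp_extrapolate` is hypothetical, and like the original suggestion this has not been benchmarked.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Precondition: T{0.5} * t is exact (t is not subnormal).
template <class T>
T linear_for_lerp(T intercept, T slope, T t) {
    const T half_prod = slope * (T{0.5} * t);
    if (std::abs(half_prod) <= std::numeric_limits<T>::max() * T{0.5}) {
        // slope * t is at most max() in magnitude, so it cannot overflow.
        return intercept + slope * t;
    }
    // slope * t would overflow: compute half of the result, then double it.
    return T{2} * (T{0.5} * intercept + half_prod);
}

// Hypothetical wrapper: handles only |t| >= 1 and assumes b - a is finite.
template <class T>
T lerp_extrapolate(T a, T b, T t) {
    if (t >= T{1}) {
        return linear_for_lerp(b, b - a, t - T{1});
    }
    return linear_for_lerp(a, b - a, t); // t <= -1
}
```

As an example, `lerp_extrapolate(1.75e308, 1.7e308, 41.0)` has a true result near -3e307, but `41.0 * (1.7e308 - 1.75e308)` overflows when evaluated directly; the sketch takes the scaled branch and returns a finite value.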

StephanTLavavej · 2022-06-19T02:23:30Z

I'm speculatively mirroring this to the MSVC-internal repo. Further changes can be pushed, but please notify me.

CaseyCarter · 2022-06-19T02:33:35Z

stl/inc/cmath

+
+    return _STD fma(_ArgT, _ArgB - _ArgA, _ArgA);
+}
+


We need to find that Twitter thread mocking our huge implementation of lerp to let them know we've added another 25 lines. (No change requested.)

StephanTLavavej · 2022-06-20T00:46:53Z

Thanks for infinitely improving this implementation! ♾️ 🚀 😹

…osoft#1918)

Co-authored-by: Adam Bucior <[email protected]>

Co-authored-by: Alexander Bolz <[email protected]>

Co-authored-by: Curtis J Bezault <[email protected]>

Co-authored-by: Matt Stephanson <[email protected]>

Co-authored-by: Michael Schellenberger Costa <[email protected]>

Co-authored-by: statementreply <[email protected]>

Co-authored-by: Stephan T. Lavavej <[email protected]>

fsb4000 requested a review from a team as a code owner May 11, 2021 05:14

miscco reviewed May 11, 2021

stl/inc/cmath Outdated

miscco reviewed May 11, 2021

tests/std/tests/P0811R3_midpoint_lerp/test.cpp Outdated

AdamBucior reviewed May 11, 2021

stl/inc/cmath Outdated

statementreply reviewed May 11, 2021

stl/inc/cmath Outdated

statementreply reviewed May 12, 2021

stl/inc/cmath Outdated

stl/inc/cmath Outdated

StephanTLavavej added the bug label May 12, 2021

cbezault reviewed May 12, 2021

stl/inc/cmath Outdated

statementreply reviewed May 13, 2021

stl/inc/cmath Outdated

stl/inc/cmath Outdated

stl/inc/cmath Outdated

fsb4000 commented May 13, 2021

stl/inc/cmath Outdated

fsb4000 and others added 12 commits July 4, 2021 08:30

Recalculate lerp if we got infinity. Eliminates some overflows.

f762d9d

rename test method and change INFINITY to numeric_limits<_Ty>::infinity()

96ef790

fix more tests

337fc81

remove unneeded _STD

4cb06ef

Co-authored-by: Michael Schellenberger Costa <[email protected]>

stl already has constexpr isinf

1d4c290

Co-authored-by: Adam Bucior <[email protected]>

Clear FE_OVERFLOW exception flag

bbfb060

include <cfenv> instead of <fenv.h>

13c578f

Check different rounding modes

3e7ab76

add some comments

7384454

Exception flags could have already been set before calling lerp

18fd0ed

Co-authored-by: statementreply <[email protected]>

use default ctor and dtor

c06a9d3

use existing code for constexpr signbit

c5404ee

Co-authored-by: Curtis J Bezault <[email protected]>

strega-nil-ms assigned strega-nil-ms and StephanTLavavej and unassigned StephanTLavavej and strega-nil-ms May 2, 2022

fsb4000 added 2 commits May 7, 2022 19:46

Merge branch 'main' into fix1917 and fix conflicts

19bd20e

change test name

838cf8f

StephanTLavavej requested changes Jun 13, 2022

tests/std/tests/P0811R3_midpoint_lerp/test.cpp

stl/inc/cmath Outdated

tests/std/tests/P0811R3_midpoint_lerp/test.cpp Outdated

StephanTLavavej removed their assignment Jun 13, 2022

fsb4000 added 2 commits June 16, 2022 00:28

add a test for float and improve the estimate of the inequality

9f2d248

Merge branch 'main' into fix1917

26ed8e2

StephanTLavavej self-assigned this Jun 15, 2022

StephanTLavavej approved these changes Jun 16, 2022

StephanTLavavej removed their assignment Jun 16, 2022

simplify the condition

f986102

Co-authored-by: Stephan T. Lavavej <[email protected]>

Co-authored-by: Matt Stephanson <[email protected]>

StephanTLavavej reviewed Jun 17, 2022

stl/inc/cmath Outdated

Move comment within the branch it describes.

77adb6c

StephanTLavavej approved these changes Jun 17, 2022

StephanTLavavej self-assigned this Jun 19, 2022

CaseyCarter approved these changes Jun 19, 2022

StephanTLavavej merged commit 8e5d0aa into microsoft:main Jun 20, 2022

fsb4000 deleted the fix1917 branch June 20, 2022 04:07
