# lint long chains of int/float additions/multiplications #1243

---
Should this be smart and notice add-assign too? i.e.:
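Something along these lines, perhaps (a hypothetical reconstruction; the original snippet didn't survive the formatting):

```rust
// Hypothetical example: an add-assign chain that is semantically the same
// float chain as `a + b + c + d`, just spelled with `+=`.
fn total(a: f32, b: f32, c: f32, d: f32) -> f32 {
    let mut acc = a;
    acc += b;
    acc += c;
    acc += d;
    acc
}
```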
Obviously an esoteric example. But if the floats all come from different functions, it might not be as noticeable.

---
Isn't this optimized already by the compiler for integer arithmetic? Integer addition is (save for overflow errors) commutative and associative. Float addition and subtraction, however, are not associative (and mul/div are, but only sometimes), so this lint would still be a good help for floating point code.

What would the threshold for the chains be? Should it be fixed, or configurable? Should it be the same for addition and multiplication?
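A quick illustration of that non-associativity (the values here are picked for demonstration and are not from the thread):

```rust
fn main() {
    let (a, b, c) = (1e8_f32, -1e8_f32, 1.0_f32);
    println!("{}", (a + b) + c); // 1: the large terms cancel first, so 1.0 survives
    println!("{}", a + (b + c)); // 0: 1.0 is absorbed into -1e8 by rounding, then lost
}
```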
---

This may be; I'm not sure. It's worth playing with http://rust.godbolt.org/ to see if that's the case. Sometimes lints that provide "optimizations" also provide clarity of code, so even if the compiler is smart enough we still have them. But in this case the parens make the code worse, so we should only add this if it really has a perf benefit.

---
## Floating point arithmetic

This is where things got interesting! Let's first do 4 parameters.

### n = 4

Here's the addition code:

```rust
pub fn add_4(a: f32, b: f32, c: f32, d: f32) -> f32 {
    a + b + c + d
}

pub fn add_4_p(a: f32, b: f32, c: f32, d: f32) -> f32 {
    (a + b) + (c + d)
}
```

The generated IR was different!

```llvm
define float @_ZN7example5add_417hbd9627e19913a4a5E(float %a, float %b, float %c, float %d) unnamed_addr #0 !dbg !5 {
%0 = fadd float %a, %b, !dbg !8
%1 = fadd float %0, %c, !dbg !8
%2 = fadd float %1, %d, !dbg !8
ret float %2, !dbg !9
}
```

Here, without parentheses, we can see a linear progression through the chain.

```llvm
define float @_ZN7example7add_4_p17h45fdde80c80ae9f4E(float %a, float %b, float %c, float %d) unnamed_addr #0 !dbg !10 {
%0 = fadd float %a, %b, !dbg !11
%1 = fadd float %c, %d, !dbg !12
%2 = fadd float %0, %1, !dbg !11
ret float %2, !dbg !13
}
```

With parentheses we can see a tree approach to the operations! Which makes a lot of sense: LLVM shouldn't be "normalizing" the order of operations if doing so changes the end result, and for floating point it definitely would.

The multiplication code's results are strikingly similar:

```rust
pub fn mul_4(a: f32, b: f32, c: f32, d: f32) -> f32 {
    a * b * c * d
}

pub fn mul_4_p(a: f32, b: f32, c: f32, d: f32) -> f32 {
    (a * b) * (c * d)
}
```

```llvm
define float @_ZN7example5mul_417h978c4d41e342ced3E(float %a, float %b, float %c, float %d) unnamed_addr #0 !dbg !22 {
%0 = fmul float %a, %b, !dbg !23
%1 = fmul float %0, %c, !dbg !23
%2 = fmul float %1, %d, !dbg !23
ret float %2, !dbg !24
}
define float @_ZN7example7mul_4_p17h539935c5f42320aeE(float %a, float %b, float %c, float %d) unnamed_addr #0 !dbg !25 {
%0 = fmul float %a, %b, !dbg !26
%1 = fmul float %c, %d, !dbg !27
%2 = fmul float %0, %1, !dbg !26
ret float %2, !dbg !28
}
```

### n = 8

Now things got really interesting. Here are the addition operations:

```rust
pub fn add_8(
    a: f32, b: f32, c: f32, d: f32,
    e: f32, f: f32, g: f32, h: f32) -> f32 {
    a + b + c + d + e + f + g + h
}

pub fn add_8_p(
    a: f32, b: f32, c: f32, d: f32,
    e: f32, f: f32, g: f32, h: f32) -> f32 {
    ((a + b) + (c + d)) + ((e + f) + (g + h))
}
```

And the generated IR:

```llvm
define float @_ZN7example5add_817hef9e967ef9fb2f7eE(float %a, float %b, float %c, float %d, float %e, float %f, float %g, float %h) unnamed_addr #0 !dbg !14 {
%0 = fadd float %a, %b, !dbg !15
%1 = fadd float %0, %c, !dbg !15
%2 = fadd float %1, %d, !dbg !15
%3 = fadd float %2, %e, !dbg !15
%4 = fadd float %3, %f, !dbg !15
%5 = fadd float %4, %g, !dbg !15
%6 = fadd float %5, %h, !dbg !15
ret float %6, !dbg !16
}
define float @_ZN7example7add_8_p17hf411fbb0fd42e0a7E(float %a, float %b, float %c, float %d, float %e, float %f, float %g, float %h) unnamed_addr #0 !dbg !17 {
%0 = insertelement <2 x float> undef, float %a, i32 0, !dbg !18
%1 = insertelement <2 x float> %0, float %e, i32 1, !dbg !18
%2 = insertelement <2 x float> undef, float %b, i32 0, !dbg !18
%3 = insertelement <2 x float> %2, float %f, i32 1, !dbg !18
%4 = fadd <2 x float> %1, %3, !dbg !18
%5 = insertelement <2 x float> undef, float %c, i32 0, !dbg !19
%6 = insertelement <2 x float> %5, float %g, i32 1, !dbg !19
%7 = insertelement <2 x float> undef, float %d, i32 0, !dbg !19
%8 = insertelement <2 x float> %7, float %h, i32 1, !dbg !19
%9 = fadd <2 x float> %6, %8, !dbg !19
%10 = fadd <2 x float> %4, %9, !dbg !20
%11 = extractelement <2 x float> %10, i32 0, !dbg !20
%12 = extractelement <2 x float> %10, i32 1, !dbg !20
%13 = fadd float %11, %12, !dbg !20
ret float %13, !dbg !21
}
```

What LLVM is doing in the second case is vector operations. I've checked the generated code (please feel free to verify it as well), and it does indeed do exactly what the parentheses say. The cool thing here is that it only does 4 addition operations instead of 7.

We can see an exact parallel between the addition code and the multiplication code's behavior when it comes to vectorization:

```rust
pub fn mul_8(
    a: f32, b: f32, c: f32, d: f32,
    e: f32, f: f32, g: f32, h: f32) -> f32 {
    a * b * c * d * e * f * g * h
}

pub fn mul_8_p(
    a: f32, b: f32, c: f32, d: f32,
    e: f32, f: f32, g: f32, h: f32) -> f32 {
    ((a * b) * (c * d)) * ((e * f) * (g * h))
}
```

```llvm
define float @_ZN7example5mul_817he6344433229b5d87E(float %a, float %b, float %c, float %d, float %e, float %f, float %g, float %h) unnamed_addr #0 !dbg !29 {
%0 = fmul float %a, %b, !dbg !30
%1 = fmul float %0, %c, !dbg !30
%2 = fmul float %1, %d, !dbg !30
%3 = fmul float %2, %e, !dbg !30
%4 = fmul float %3, %f, !dbg !30
%5 = fmul float %4, %g, !dbg !30
%6 = fmul float %5, %h, !dbg !30
ret float %6, !dbg !31
}
define float @_ZN7example7mul_8_p17h7be0422ddf0675e3E(float %a, float %b, float %c, float %d, float %e, float %f, float %g, float %h) unnamed_addr #0 !dbg !32 {
%0 = insertelement <2 x float> undef, float %a, i32 0, !dbg !33
%1 = insertelement <2 x float> %0, float %e, i32 1, !dbg !33
%2 = insertelement <2 x float> undef, float %b, i32 0, !dbg !33
%3 = insertelement <2 x float> %2, float %f, i32 1, !dbg !33
%4 = fmul <2 x float> %1, %3, !dbg !33
%5 = insertelement <2 x float> undef, float %c, i32 0, !dbg !34
%6 = insertelement <2 x float> %5, float %g, i32 1, !dbg !34
%7 = insertelement <2 x float> undef, float %d, i32 0, !dbg !34
%8 = insertelement <2 x float> %7, float %h, i32 1, !dbg !34
%9 = fmul <2 x float> %6, %8, !dbg !34
%10 = fmul <2 x float> %4, %9, !dbg !35
%11 = extractelement <2 x float> %10, i32 0, !dbg !35
%12 = extractelement <2 x float> %10, i32 1, !dbg !35
%13 = fmul float %11, %12, !dbg !35
ret float %13, !dbg !36
}
```

## Conclusions

This is very interesting. Since the n=4 case didn't vectorize, and knowing how performance-focused LLVM is, I think we can assume that somewhere between n=8 and n=4 the costs of vectorization outweigh the benefits. Of course, this asks for two things: benchmarking, and checking what the codegen is for n=16. Guess which one I'm gonna do now! :D
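On the benchmarking side, here is a minimal sketch of what a first measurement could look like (everything below is illustrative and not from this thread; `std::hint::black_box` keeps the compiler from hoisting the loop-invariant sums, and a real comparison should use a proper harness like criterion):

```rust
use std::hint::black_box;
use std::time::Instant;

fn main() {
    let v: Vec<f32> = (1..=8).map(|i| i as f32).collect();
    let (a, b, c, d) = (v[0], v[1], v[2], v[3]);
    let (e, f, g, h) = (v[4], v[5], v[6], v[7]);

    // Chained form: every fadd depends on the previous one.
    let start = Instant::now();
    let mut acc = 0.0f32;
    for _ in 0..10_000_000 {
        acc += black_box(a) + black_box(b) + black_box(c) + black_box(d)
            + black_box(e) + black_box(f) + black_box(g) + black_box(h);
    }
    println!("chain: {:?} (acc = {})", start.elapsed(), acc);

    // Tree form: additions at the same depth are independent.
    let start = Instant::now();
    let mut acc = 0.0f32;
    for _ in 0..10_000_000 {
        acc += (black_box(a) + black_box(b)) + (black_box(c) + black_box(d))
            + ((black_box(e) + black_box(f)) + (black_box(g) + black_box(h)));
    }
    println!("tree:  {:?} (acc = {})", start.elapsed(), acc);
}
```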
---

## Bonus Track: n = 16

Well, it had to be done. Here are the results.

### Original code

```rust
pub fn add_8(
    a: f32, b: f32, c: f32, d: f32,
    e: f32, f: f32, g: f32, h: f32,
    i: f32, j: f32, k: f32, l: f32,
    m: f32, n: f32, o: f32, p: f32) -> f32 {
    a + b + c + d + e + f + g + h +
    i + j + k + l + m + n + o + p
}

pub fn add_8_p(
    a: f32, b: f32, c: f32, d: f32,
    e: f32, f: f32, g: f32, h: f32,
    i: f32, j: f32, k: f32, l: f32,
    m: f32, n: f32, o: f32, p: f32) -> f32 {
    (((a + b) + (c + d)) + ((e + f) + (g + h))) +
    (((i + j) + (k + l)) + ((m + n) + (o + p)))
}

pub fn mul_8(
    a: f32, b: f32, c: f32, d: f32,
    e: f32, f: f32, g: f32, h: f32,
    i: f32, j: f32, k: f32, l: f32,
    m: f32, n: f32, o: f32, p: f32) -> f32 {
    a * b * c * d * e * f * g * h *
    i * j * k * l * m * n * o * p
}

pub fn mul_8_p(
    a: f32, b: f32, c: f32, d: f32,
    e: f32, f: f32, g: f32, h: f32,
    i: f32, j: f32, k: f32, l: f32,
    m: f32, n: f32, o: f32, p: f32) -> f32 {
    (((a * b) * (c * d)) * ((e * f) * (g * h))) *
    (((i * j) * (k * l)) * ((m * n) * (o * p)))
}
```

(The functions are still named `add_8`/`mul_8` even though they now take 16 parameters; the mangled names in the IR below match.)

### Generated IR

```llvm
define float @_ZN7example5add_817hfacaa9e4a302cdeaE(float %a, float %b, float %c, float %d, float %e, float %f, float %g, float %h, float %i, float %j, float %k, float %l, float %m, float %n, float %o, float %p) unnamed_addr #0 !dbg !5 {
%0 = fadd float %a, %b, !dbg !8
%1 = fadd float %0, %c, !dbg !8
%2 = fadd float %1, %d, !dbg !8
%3 = fadd float %2, %e, !dbg !8
%4 = fadd float %3, %f, !dbg !8
%5 = fadd float %4, %g, !dbg !8
%6 = fadd float %5, %h, !dbg !8
%7 = fadd float %6, %i, !dbg !8
%8 = fadd float %7, %j, !dbg !8
%9 = fadd float %8, %k, !dbg !8
%10 = fadd float %9, %l, !dbg !8
%11 = fadd float %10, %m, !dbg !8
%12 = fadd float %11, %n, !dbg !8
%13 = fadd float %12, %o, !dbg !8
%14 = fadd float %13, %p, !dbg !8
ret float %14, !dbg !9
}
define float @_ZN7example7add_8_p17hf34e2f52a58a2f66E(float %a, float %b, float %c, float %d, float %e, float %f, float %g, float %h, float %i, float %j, float %k, float %l, float %m, float %n, float %o, float %p) unnamed_addr #0 !dbg !10 {
%0 = insertelement <2 x float> undef, float %a, i32 0, !dbg !11
%1 = insertelement <2 x float> %0, float %i, i32 1, !dbg !11
%2 = insertelement <2 x float> undef, float %b, i32 0, !dbg !11
%3 = insertelement <2 x float> %2, float %j, i32 1, !dbg !11
%4 = fadd <2 x float> %1, %3, !dbg !11
%5 = insertelement <2 x float> undef, float %c, i32 0, !dbg !12
%6 = insertelement <2 x float> %5, float %k, i32 1, !dbg !12
%7 = insertelement <2 x float> undef, float %d, i32 0, !dbg !12
%8 = insertelement <2 x float> %7, float %l, i32 1, !dbg !12
%9 = fadd <2 x float> %6, %8, !dbg !12
%10 = fadd <2 x float> %4, %9, !dbg !13
%11 = insertelement <2 x float> undef, float %e, i32 0, !dbg !14
%12 = insertelement <2 x float> %11, float %m, i32 1, !dbg !14
%13 = insertelement <2 x float> undef, float %f, i32 0, !dbg !14
%14 = insertelement <2 x float> %13, float %n, i32 1, !dbg !14
%15 = fadd <2 x float> %12, %14, !dbg !14
%16 = insertelement <2 x float> undef, float %g, i32 0, !dbg !15
%17 = insertelement <2 x float> %16, float %o, i32 1, !dbg !15
%18 = insertelement <2 x float> undef, float %h, i32 0, !dbg !15
%19 = insertelement <2 x float> %18, float %p, i32 1, !dbg !15
%20 = fadd <2 x float> %17, %19, !dbg !15
%21 = fadd <2 x float> %15, %20, !dbg !16
%22 = fadd <2 x float> %10, %21, !dbg !17
%23 = extractelement <2 x float> %22, i32 0, !dbg !17
%24 = extractelement <2 x float> %22, i32 1, !dbg !17
%25 = fadd float %23, %24, !dbg !17
ret float %25, !dbg !18
}
define float @_ZN7example5mul_817he802d2a8e2d2b3afE(float %a, float %b, float %c, float %d, float %e, float %f, float %g, float %h, float %i, float %j, float %k, float %l, float %m, float %n, float %o, float %p) unnamed_addr #0 !dbg !19 {
%0 = fmul float %a, %b, !dbg !20
%1 = fmul float %0, %c, !dbg !20
%2 = fmul float %1, %d, !dbg !20
%3 = fmul float %2, %e, !dbg !20
%4 = fmul float %3, %f, !dbg !20
%5 = fmul float %4, %g, !dbg !20
%6 = fmul float %5, %h, !dbg !20
%7 = fmul float %6, %i, !dbg !20
%8 = fmul float %7, %j, !dbg !20
%9 = fmul float %8, %k, !dbg !20
%10 = fmul float %9, %l, !dbg !20
%11 = fmul float %10, %m, !dbg !20
%12 = fmul float %11, %n, !dbg !20
%13 = fmul float %12, %o, !dbg !20
%14 = fmul float %13, %p, !dbg !20
ret float %14, !dbg !21
}
define float @_ZN7example7mul_8_p17h4ad349c7d77b6c6cE(float %a, float %b, float %c, float %d, float %e, float %f, float %g, float %h, float %i, float %j, float %k, float %l, float %m, float %n, float %o, float %p) unnamed_addr #0 !dbg !22 {
%0 = insertelement <2 x float> undef, float %a, i32 0, !dbg !23
%1 = insertelement <2 x float> %0, float %i, i32 1, !dbg !23
%2 = insertelement <2 x float> undef, float %b, i32 0, !dbg !23
%3 = insertelement <2 x float> %2, float %j, i32 1, !dbg !23
%4 = fmul <2 x float> %1, %3, !dbg !23
%5 = insertelement <2 x float> undef, float %c, i32 0, !dbg !24
%6 = insertelement <2 x float> %5, float %k, i32 1, !dbg !24
%7 = insertelement <2 x float> undef, float %d, i32 0, !dbg !24
%8 = insertelement <2 x float> %7, float %l, i32 1, !dbg !24
%9 = fmul <2 x float> %6, %8, !dbg !24
%10 = fmul <2 x float> %4, %9, !dbg !25
%11 = insertelement <2 x float> undef, float %e, i32 0, !dbg !26
%12 = insertelement <2 x float> %11, float %m, i32 1, !dbg !26
%13 = insertelement <2 x float> undef, float %f, i32 0, !dbg !26
%14 = insertelement <2 x float> %13, float %n, i32 1, !dbg !26
%15 = fmul <2 x float> %12, %14, !dbg !26
%16 = insertelement <2 x float> undef, float %g, i32 0, !dbg !27
%17 = insertelement <2 x float> %16, float %o, i32 1, !dbg !27
%18 = insertelement <2 x float> undef, float %h, i32 0, !dbg !27
%19 = insertelement <2 x float> %18, float %p, i32 1, !dbg !27
%20 = fmul <2 x float> %17, %19, !dbg !27
%21 = fmul <2 x float> %15, %20, !dbg !28
%22 = fmul <2 x float> %10, %21, !dbg !29
%23 = extractelement <2 x float> %22, i32 0, !dbg !29
%24 = extractelement <2 x float> %22, i32 1, !dbg !29
%25 = fmul float %23, %24, !dbg !29
ret float %25, !dbg !30
}
```

As you can see, vectorization is still the compiler's choice, and the number of operations in the "operation tree" is still about half (plus one) of the "operation chain"'s IR.

## Conclusions v2

I don't know what the verdict would be. From the n=4 code, or by hand, we can see there is a way of processing the parenthesized tree with the same number of operations the chain needs. Here's how to do it for n = 8:

```llvm
%0 = fmul float %a, %b
%1 = fmul float %0, %c
%2 = fmul float %1, %d
%3 = fmul float %2, %e
%4 = fmul float %3, %f
%5 = fmul float %4, %g
%6 = fmul float %5, %h
```

And here is the tree version, also with seven operations:

```llvm
%0 = fmul float %a, %b
%1 = fmul float %c, %d
%2 = fmul float %e, %f
%3 = fmul float %g, %h
%4 = fmul float %0, %1
%5 = fmul float %2, %3
%6 = fmul float %4, %5
```

However, for n = 8 and onwards, LLVM seems to prefer vectorization. That would mean the vectorized processing of the tree is faster than the linear processing of it. And since linearly processing the tree costs the same as linearly processing the chain, the vectorized processing of the tree must also be faster than the linear processing of the chain.

We might need to make some benchmarks, but I think the optimization potential of converting the "chain" into a "tree" is real; otherwise LLVM wouldn't be making that transformation. I'd implement the lint, but for floating point only, and from chains of size 8 and up.

---
## Summary

To summarize my km-length posts: (context:

---
## Thoughts on numerical stability

Since floating point ops are always lossy, reducing the number of operations is always good. However, vectorization does not reduce the number of operations; it only parallelizes them. And it can be seen while writing the parentheses that one never deletes any asterisks or pluses.

That's why I think that this lint, while definitely helping with performance, would not help with stability. What do y'all think? :)

---
Updated my comments so that they are not huge, intimidating walls of code. Now the code is mostly in "spoiler blocks" ^^

---
Update: it seems the change from linear to vectorized occurs at exactly 8 variables. I haven't been able to make LLVM vectorize the chain with fewer variables, regardless of how I group the operations. 8 might be the "magic number" we ought to look for in this lint.

---
In general, `(a+b+c)+(d+e+f)` is faster to compute than `a+b+c+d+e+f`, because the former reduces data dependencies, allowing the CPU to run additions in parallel. For float arithmetic, the tree-like addition may improve numerical stability. Suggest inserting parentheses.
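As a sketch of what the suggested parenthesization generalizes to, here is a small illustration (added for this write-up, not from the issue; the name `pairwise_sum` is invented): splitting in half and reducing each side makes same-depth additions independent, and for floats this pairwise scheme is also the classic way to keep rounding error growing like O(log n) instead of O(n).

```rust
/// Pairwise (tree-shaped) reduction of a slice: for 8 elements this computes
/// `((a+b)+(c+d)) + ((e+f)+(g+h))`, mirroring the parenthesized examples above.
fn pairwise_sum(xs: &[f32]) -> f32 {
    match xs.len() {
        0 => 0.0,
        1 => xs[0],
        n => pairwise_sum(&xs[..n / 2]) + pairwise_sum(&xs[n / 2..]),
    }
}

fn main() {
    let xs: Vec<f32> = (1..=8).map(|i| i as f32).collect();
    println!("{}", pairwise_sum(&xs)); // 36
}
```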