optimize append #2164

moiwi · 2021-03-05T08:56:32Z

Avoid one unnecessary empty loop pass, for example in:

std::string message = fmt::sprintf("The answer is %d", 42);

(This mostly matters when a % placeholder is at the beginning or end of the string to be formatted.)
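
For readers who want to see what an "empty loop pass" can look like, here is a minimal sketch. It is not the code in format.h, and the diff itself is not shown in this conversation. It assumes the saved pass comes from a loop whose body runs once even when there is nothing left to copy (the benchmark target name dowhileopt used further down hints at a do/while loop). The names append_do_while, append_while and chunk are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration only; this is not the code in format.h.
// Each function copies [begin, end) into `out` in pieces of at most
// `chunk` characters and returns how many times the loop body ran.
constexpr std::size_t chunk = 4;

// do/while shape: the body runs at least once, even for an empty range.
inline int append_do_while(std::string& out, const char* begin,
                           const char* end) {
  int passes = 0;
  do {
    ++passes;
    std::size_t count = static_cast<std::size_t>(end - begin);
    if (count > chunk) count = chunk;
    out.append(begin, count);
    begin += count;
  } while (begin != end);
  return passes;
}

// while shape: an empty range skips the body entirely.
inline int append_while(std::string& out, const char* begin,
                        const char* end) {
  int passes = 0;
  while (begin != end) {
    ++passes;
    std::size_t count = static_cast<std::size_t>(end - begin);
    if (count > chunk) count = chunk;
    out.append(begin, count);
    begin += count;
  }
  return passes;
}
```

In this sketch the two shapes differ only for an empty range, which is what a formatter would hand over when a placeholder sits at the very start or end of the format string, or when two placeholders are adjacent. For non-empty input both functions do the same number of passes and produce the same output.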

I agree that my contributions are licensed under the {fmt} license, and agree to future changes to the licensing.

optimize loop

vitaut · 2021-03-10T19:01:20Z

Thanks for the PR. Could you post a benchmark demonstrating the effect of this change in the comments here?

rimathia · 2021-03-11T17:02:55Z

I ran the following simple benchmark:
fmt::format("{}", 'a') (and the corresponding cases with five and ten arguments)
The benchmark parameter N appends N non-format characters after each "{}". The N != 0 runs are mostly there to check that the "normal" case isn't affected in an unexpected way.
The expected improvement for N=0 shows up in the five-argument case for gcc and in the five- and ten-argument cases for clang.
I've only checked the reproducibility of the benchmark results by manually comparing runs.

gcc (g++-10 (Homebrew GCC 10.2.0_4) 10.2.0):

cd ../fmt/ && git checkout e718ec3e93 && git log -1 && cd ../build_gcc && make dowhileopt && ./dowhileopt
HEAD is now at e718ec3e Make truncating_iterator an output_iterator (#2158)
commit e718ec3e93dc75598bfbcdaf492554ae2114ec19 (HEAD, moiwi/master)
Author: Jason Cobb <[email protected]>
Date:   Thu Mar 4 18:53:08 2021 -0500

    Make truncating_iterator an output_iterator (#2158)
<build cut out>
2021-03-11 17:25:45
Running ./dowhileopt
Run on (16 X 2400 MHz CPU s)
CPU Caches:
  L1 Data 32K (x8)
  L1 Instruction 32K (x8)
  L2 Unified 262K (x8)
  L3 Unified 16777K (x1)
Load Average: 2.15, 2.08, 1.86
---------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
one_format_argument/0            7.77 ns         7.77 ns     80873433
one_format_argument/10           29.2 ns         29.2 ns     23374863
one_format_argument/100          90.5 ns         90.5 ns      7942361
one_format_argument/1000          256 ns          256 ns      2957542
five_format_arguments/0          62.5 ns         62.5 ns     11517325
five_format_arguments/10          144 ns          144 ns      4760383
five_format_arguments/100         307 ns          307 ns      2168290
five_format_arguments/1000        976 ns          976 ns       655977
ten_format_arguments/0            111 ns          111 ns      6218298
ten_format_arguments/10           264 ns          264 ns      2778362
ten_format_arguments/100          458 ns          458 ns      1462666
ten_format_arguments/1000        1782 ns         1782 ns       414768
Mathiass-MacBook-Pro-2:build_gcc mathiasritzmann$ cd ../fmt/ && git checkout b3e6d && git log -1 && cd ../build_gcc && make dowhileopt && ./dowhileopt
Previous HEAD position was e718ec3e Make truncating_iterator an output_iterator (#2158)
HEAD is now at b3e6d017 Update format.h
commit b3e6d017a5045e7d6a5828bc0a59066da03c216a (HEAD, moiwi/moiwi-opt)
Author: moiwi <[email protected]>
Date:   Fri Mar 5 09:52:54 2021 +0100

    Update format.h
    
    optimize loop
<build cut out>
2021-03-11 17:26:25
Running ./dowhileopt
Run on (16 X 2400 MHz CPU s)
CPU Caches:
  L1 Data 32K (x8)
  L1 Instruction 32K (x8)
  L2 Unified 262K (x8)
  L3 Unified 16777K (x1)
Load Average: 1.76, 1.99, 1.83
---------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
one_format_argument/0            7.64 ns         7.63 ns     82745251
one_format_argument/10           28.8 ns         28.8 ns     25440576
one_format_argument/100          84.4 ns         84.4 ns      8184359
one_format_argument/1000          244 ns          244 ns      2932625
five_format_arguments/0          55.7 ns         55.7 ns     10177821
five_format_arguments/10          139 ns          139 ns      4978840
five_format_arguments/100         316 ns          316 ns      2271562
five_format_arguments/1000        945 ns          945 ns       722797
ten_format_arguments/0            109 ns          109 ns      6446681
ten_format_arguments/10           252 ns          252 ns      2669351
ten_format_arguments/100          469 ns          469 ns      1511367
ten_format_arguments/1000        1694 ns         1694 ns       413663

clang (Apple clang version 12.0.0 (clang-1200.0.32.29)):

cd ../fmt/ && git checkout e718ec3e93 && git log -1 && cd ../build_clang && make dowhileopt && ./dowhileopt
Previous HEAD position was b3e6d017 Update format.h
HEAD is now at e718ec3e Make truncating_iterator an output_iterator (#2158)
commit e718ec3e93dc75598bfbcdaf492554ae2114ec19 (HEAD, moiwi/master)
Author: Jason Cobb <[email protected]>
Date:   Thu Mar 4 18:53:08 2021 -0500

    Make truncating_iterator an output_iterator (#2158)
<build cut out>
2021-03-11 17:36:30
Running ./dowhileopt
Run on (16 X 2400 MHz CPU s)
CPU Caches:
  L1 Data 32K (x8)
  L1 Instruction 32K (x8)
  L2 Unified 262K (x8)
  L3 Unified 16777K (x1)
Load Average: 1.66, 1.74, 1.76
---------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
one_format_argument/0            4.26 ns         4.26 ns    157338374
one_format_argument/10           26.9 ns         26.9 ns     27105518
one_format_argument/100          87.7 ns         87.7 ns      7845950
one_format_argument/1000          266 ns          266 ns      2863302
five_format_arguments/0          45.8 ns         45.8 ns     15294262
five_format_arguments/10          155 ns          155 ns      4683339
five_format_arguments/100         313 ns          313 ns      2212935
five_format_arguments/1000       1062 ns         1062 ns       685019
ten_format_arguments/0           86.0 ns         86.0 ns      7977026
ten_format_arguments/10           255 ns          255 ns      2765093
ten_format_arguments/100          512 ns          512 ns      1022570
ten_format_arguments/1000        1846 ns         1846 ns       378706
Mathiass-MacBook-Pro-2:build_clang mathiasritzmann$ cd ../fmt/ && git checkout b3e6d && git log -1 && cd ../build_clang && make dowhileopt && ./dowhileopt
Previous HEAD position was e718ec3e Make truncating_iterator an output_iterator (#2158)
HEAD is now at b3e6d017 Update format.h
commit b3e6d017a5045e7d6a5828bc0a59066da03c216a (HEAD, moiwi/moiwi-opt)
Author: moiwi <[email protected]>
Date:   Fri Mar 5 09:52:54 2021 +0100

    Update format.h
    
    optimize loop
<build cut out>
2021-03-11 17:37:19
Running ./dowhileopt
Run on (16 X 2400 MHz CPU s)
CPU Caches:
  L1 Data 32K (x8)
  L1 Instruction 32K (x8)
  L2 Unified 262K (x8)
  L3 Unified 16777K (x1)
Load Average: 1.68, 1.73, 1.75
---------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
one_format_argument/0            4.36 ns         4.36 ns    128839889
one_format_argument/10           26.7 ns         26.7 ns     26903417
one_format_argument/100          88.2 ns         88.2 ns      7920523
one_format_argument/1000          261 ns          261 ns      2792951
five_format_arguments/0          42.3 ns         42.3 ns     16470123
five_format_arguments/10          156 ns          156 ns      4676081
five_format_arguments/100         314 ns          314 ns      2210008
five_format_arguments/1000       1069 ns         1069 ns       687251
ten_format_arguments/0           71.9 ns         71.9 ns      9592458
ten_format_arguments/10           252 ns          252 ns      2639756
ten_format_arguments/100          528 ns          528 ns      1000000
ten_format_arguments/1000        1837 ns         1837 ns       370500
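
To put the N=0 rows into relative terms, a small helper like the one below can be used. It is only a sketch; the function name percent_saved is made up for this illustration, and the inputs are the single-run timings from the tables above, so the resulting percentages are rough.

```cpp
#include <cassert>
#include <cmath>

// Relative time saved when going from `before_ns` to `after_ns`, in percent.
// Positive values mean the second measurement was faster.
inline double percent_saved(double before_ns, double after_ns) {
  return (before_ns - after_ns) / before_ns * 100.0;
}
```

Applied to the tables above: five_format_arguments/0 goes from 62.5 ns to 55.7 ns with gcc (about 10.9% less time) and from 45.8 ns to 42.3 ns with clang (about 7.6%); ten_format_arguments/0 goes from 86.0 ns to 71.9 ns with clang (about 16.4%).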

Here's the benchmark code:

#include <benchmark/benchmark.h>
#include <fmt/format.h>
#include <fmt/printf.h>

#include <stdexcept>  // std::runtime_error
#include <string>     // std::string

void one_format_argument(benchmark::State& state) {
  const auto format_string = "{}" + std::string(state.range(0), 'x');
  const auto test_output = fmt::format(format_string, 'a');
  if (test_output != "a" + std::string(state.range(0), 'x')) {
    throw std::runtime_error("wrong");
  }
  for (auto _ : state) {
    benchmark::DoNotOptimize(fmt::format(format_string, 'a'));
  }
}

void five_format_arguments(benchmark::State& state) {
  auto format_string = std::string();
  for (int i = 0; i < 5; ++i) {
    format_string += "{}" + std::string(state.range(0), 'x');
  }
  const auto test_output = fmt::format(format_string, 'a', 'a', 'a', 'a', 'a');
  if (state.range(0) == 0 && test_output != std::string(5, 'a')) {
    throw std::runtime_error("wrong");
  }
  for (auto _ : state) {
    benchmark::DoNotOptimize(
        fmt::format(format_string, 'a', 'a', 'a', 'a', 'a'));
  }
}

void ten_format_arguments(benchmark::State& state) {
  auto format_string = std::string();
  for (int i = 0; i < 10; ++i) {
    format_string += "{}" + std::string(state.range(0), 'x');
  }
  const auto test_output = fmt::format(format_string, 'a', 'a', 'a', 'a', 'a',
                                       'a', 'a', 'a', 'a', 'a');
  if (state.range(0) == 0 && test_output != std::string(10, 'a')) {
    throw std::runtime_error("wrong");
  }
  for (auto _ : state) {
    benchmark::DoNotOptimize(fmt::format(format_string, 'a', 'a', 'a', 'a', 'a',
                                         'a', 'a', 'a', 'a', 'a'));
  }
}

BENCHMARK(one_format_argument)->Arg(0)->Arg(10)->Arg(100)->Arg(1000);
BENCHMARK(five_format_arguments)->Arg(0)->Arg(10)->Arg(100)->Arg(1000);
BENCHMARK(ten_format_arguments)->Arg(0)->Arg(10)->Arg(100)->Arg(1000);

BENCHMARK_MAIN();

vitaut · 2021-03-13T15:21:30Z

Thank you both!

Update format.h

b3e6d01

optimize loop

moiwi changed the title ~~Update format.h~~ optimize append Mar 5, 2021

moiwi added 2 commits March 12, 2021 11:19

Merge branch 'master' into moiwi-opt

d08fd87

Update format.h

6276f1d

vitaut merged commit b8ff3c1 into fmtlib:master Mar 13, 2021

dependabot bot mentioned this pull request Mar 14, 2021

Bump external/fmt from bbd6ed5 to 6151d0d link-u/libavif-container#85

Closed

moiwi deleted the moiwi-opt branch December 25, 2021 14:03
