Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential performance regression of vformat_to in libfmt 11 compared to libfmt 10.2.1 #4070

Closed
odygrd opened this issue Jul 12, 2024 · 4 comments

Comments

@odygrd
Copy link

odygrd commented Jul 12, 2024

Hello,

I am experiencing a drop in throughput when migrating my logging library to libfmt 11 compared to libfmt 10.2.1.

I haven't been able to figure out yet if this is something related to my project or the newer version of libfmt, but the only change is the libfmt library.

Environment
Compiler: GCC 13
OS: Linux RHEL
Build Configuration: Release build with -O3 -DNDEBUG -std=gnu++20
libfmt: Header-only version

#include "quill/bundled/fmt/base.h"
#include "quill/bundled/fmt/format.h"

#include <string>
#include <iostream>
#include <chrono>
#include <atomic>

void ClobberMemory() {
  std::atomic_signal_fence(std::memory_order_acq_rel);
}

int main()
{
  using format_buffer_t = fmt::basic_memory_buffer<char>;

  fmt::string_view format_str = "Hello, {}. The answer is {} and {}.";

  int a = 1;
  int b = 2345;
  int c = 6789;
  auto args = fmt::make_format_args(a, b, c);

  const int n = 10'000'000;

  auto start = std::chrono::high_resolution_clock::now();

  for (int iteration = 0; iteration < n; ++iteration)
  {
    format_buffer_t buffer;
    fmt::vformat_to(std::back_inserter(buffer), format_str, args);
  }

  ClobberMemory();

  auto end = std::chrono::high_resolution_clock::now();

  std::chrono::duration<double> duration = end - start;
  double total_time = duration.count() * 1000;  // Convert to milliseconds

  std::cout << "Total time for formatting " << n << " strings: "
            << total_time << " ms" << std::endl;

  return 0;
}

I ran the above several times for both versions and it seems that version 10.2.1 is always faster. I am running it on an isolated cpu and nothing else is running on the whole box

image

Are there any known changes in libfmt 11 that could explain this difference?

@vitaut
Copy link
Contributor

vitaut commented Jul 12, 2024

There was a temporary regression in f2c55f6 that was mostly fixed in f348d1a but it looks like it wasn't completely fixed.

@vitaut
Copy link
Contributor

vitaut commented Jul 12, 2024

Basically we need to reintroduce the detection of back_insert_iterator<buffer> which was accidentally dropped.

@vitaut
Copy link
Contributor

vitaut commented Jul 13, 2024

Done in 33e7ed1.

Before:

Total time for formatting 10000000 strings: 664.418 ms

After:

Total time for formatting 10000000 strings: 618.701 ms

(best of three runs in each case)

This fixes most of the regression, will look into the remaining discrepancy.

@vitaut
Copy link
Contributor

vitaut commented Jul 14, 2024

The regression should be fixed as of f29a7e7.

10.2.1:

Total time for formatting 10000000 strings: 527.554 ms

f29a7e7:

Total time for formatting 10000000 strings: 525.269 ms

(again, best of 3 runs)

Thanks for catching this!

@vitaut vitaut closed this as completed Jul 14, 2024
mtremer pushed a commit to ipfire/ipfire-2.x that referenced this issue Aug 15, 2024
- Update from version 11.0.1 to 11.0.2
- Update of rootfile
- Changelog
    11.0.2
	- Fixed compatibility with non-POSIX systems
	  (fmtlib/fmt#4054,
	  fmtlib/fmt#4060).
	- Fixed performance regressions when using `std::back_insert_iterator` with
	  `fmt::format_to` (fmtlib/fmt#4070).
	- Fixed handling of `std::generator` and move-only iterators
	  (fmtlib/fmt#4053,
	  fmtlib/fmt#4057). Thanks @Arghnews.
	- Made `formatter<std::string_view>::parse` work with types convertible to
	  `std::string_view` (fmtlib/fmt#4036,
	  fmtlib/fmt#4055). Thanks @Arghnews.
	- Made `volatile void*` formattable
	  (fmtlib/fmt#4049,
	  fmtlib/fmt#4056). Thanks @Arghnews.
	- Made `Glib::ustring` not be confused with `std::string`
	  (fmtlib/fmt#4052).
	- Made `fmt::context` iterator compatible with STL algorithms that rely on
	  iterator category (fmtlib/fmt#4079).

Signed-off-by: Adolf Belka <[email protected]>
Signed-off-by: Michael Tremer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants