GCC 14 optimizer doesn't like std::filesystem::path formatting (Wstringop-overflow) #4125

Closed
Blzut3 opened this issue Aug 19, 2024 · 8 comments

@Blzut3

Blzut3 commented Aug 19, 2024

Starting with 11.0.2, formatting std::filesystem::path (i.e. fmt::to_string(std::filesystem::path {...})) with GCC 14 produces stringop-overflow warnings. This is most visible with link-time optimization, but it is also reproducible in some cases without it. While this can be reproduced by just calling fmt::to_string<std::filesystem::path>, while exploring I found that explicitly instantiating two particular templates is enough to trigger it (either one alone is not enough). -O3 optimization seems to be required for this to trigger.

https://godbolt.org/z/Gsvoqz71e

I've traced the change causing this to commit f29a7e7, specifically the removal of the copy overload there. Reverting that overload removal is one possible workaround.

Despite the functions that are instantiated there, the function actually at fault seems to be for_each_codepoint. Increasing the size of buf by up to 8 characters eliminates the warnings. I've also had success getting rid of them by storing the start and end pointers for the string instead and removing the if (s.size() >= block_size) check to simplify the flow (this might have needed an assume somewhere as well). Ultimately this feels like a false positive to me, and I can report it to GCC if others agree.
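
For readers unfamiliar with the function, its tail handling can be sketched roughly as follows. This is a simplified, self-contained illustration and not fmt's actual code: `decode4` is a hypothetical stand-in for `utf8_decode` that merely reads four bytes and advances by one, and `process_tail` is an invented name. It shows why the copy into `buf` never exceeds `block_size - 1` bytes and why every 4-byte read stays inside the zero-padded buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t block_size = 4;  // the decoder always reads 4 bytes

// Hypothetical stand-in for utf8_decode: reads p[0..3], consumes one byte.
inline const char* decode4(const char* p, unsigned* sum) {
  for (std::size_t i = 0; i < block_size; ++i)
    *sum += static_cast<unsigned char>(p[i]);
  return p + 1;
}

// Processes a tail of fewer than block_size bytes through a zero-padded
// buffer. Returns the number of decode steps performed.
inline std::size_t process_tail(const char* p, std::size_t num_chars_left) {
  assert(num_chars_left < block_size);
  if (num_chars_left == 0) return 0;

  char buf[2 * block_size - 1] = {};
  std::memcpy(buf, p, num_chars_left);  // at most block_size - 1 = 3 bytes

  unsigned sum = 0;
  std::size_t steps = 0;
  const char* buf_ptr = buf;
  do {
    // The last read starts at index num_chars_left - 1 <= 2, so it touches
    // at most buf[5], which is inside the 7-byte buffer.
    buf_ptr = decode4(buf_ptr, &sum);
    ++steps;
  } while (static_cast<std::size_t>(buf_ptr - buf) < num_chars_left);
  return steps;
}
```

Under these assumptions the write into `buf` is bounded by 3 bytes, which is why a report of "writing 8 bytes into a region of size 7" looks like a false positive.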

I've tried various combinations of [[assume()]] without changing the code structure, to no avail, so GCC seems particularly stubborn about this one.
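
For reference, a hint of this kind might look like the sketch below. It is illustrative only: `SKETCH_ASSUME` and `copy_tail` are made-up names, and the macro is an assumption that falls back to a no-op unless the compiler is in C++23 mode and reports support for the `assume` attribute. As stated above, hints like this did not silence the warning in fmt itself.

```cpp
#include <cassert>
#include <cstddef>

// Use the C++23 assume attribute only when it is available in the current
// language mode; otherwise expand to a no-op so the sketch still compiles.
#if defined(__has_cpp_attribute) && __cplusplus > 202002L
#  if __has_cpp_attribute(assume) >= 202207L
#    define SKETCH_ASSUME(expr) [[assume(expr)]]
#  endif
#endif
#ifndef SKETCH_ASSUME
#  define SKETCH_ASSUME(expr) ((void)0)
#endif

constexpr std::size_t block_size = 4;

// Copies n bytes into dst. The caller guarantees 0 < n < block_size, and the
// hint passes that bound on to the optimizer before the loop.
inline char* copy_tail(const char* src, std::size_t n, char* dst) {
  SKETCH_ASSUME(n > 0 && n < block_size);
  for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
  return dst + n;
}
```

An assumption that is violated at run time is undefined behavior, so such a hint is only safe where the bound is already guaranteed by the surrounding logic.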

In file included from /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:41,
                 from /opt/compiler-explorer/libs/fmt/trunk/include/fmt/std.h:11,
                 from <source>:1:
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 8 bytes into a region of size 7 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 16 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 16 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 17 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [1, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 17 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 18 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [2, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 18 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 19 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [3, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 19 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 20 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [4, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 20 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 21 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [5, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 21 into destination object 'buf' of size 7
In function 'constexpr OutputIt fmt::v11::detail::copy(InputIt, InputIt, OutputIt) [with T = char; InputIt = const char*; OutputIt = char*; typename std::enable_if<(! is_back_insert_iterator<OutputIt>::value), int>::type <anonymous> = 0]',
    inlined from 'constexpr void fmt::v11::detail::for_each_codepoint(fmt::v11::string_view, F) [with F = find_escape(const char*, const char*)::<lambda(uint32_t, fmt::v11::string_view)>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:678:15,
    inlined from 'fmt::v11::detail::find_escape_result<char> fmt::v11::detail::find_escape(const char*, const char*)' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1849:21,
    inlined from 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]' at /opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:1941:30:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/base.h:1220:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1220 |   while (begin != end) *out++ = static_cast<T>(*begin++);
      |                        ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h: In function 'OutputIt fmt::v11::detail::write_escaped_string(OutputIt, fmt::v11::basic_string_view<Char>) [with Char = char; OutputIt = fmt::v11::basic_appender<char>]':
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 22 into destination object 'buf' of size 7
  677 |     char buf[2 * block_size - 1] = {};
      |          ^~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset [6, 7] into destination object 'buf' of size 7
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:677:10: note: at offset 22 into destination object 'buf' of size 7
@macdew

macdew commented Aug 21, 2024

I'm trying to upgrade my project to fmt 11 and noticed the same GCC warning.

@sunmy2019
Contributor

Looks like a compiler bug

@sunmy2019
Contributor

sunmy2019 commented Aug 24, 2024

I investigated it.

The difference here is that GCC 14 performs one more inlining step. You can mimic the GCC 14 behavior in GCC 13 by adding the [[gnu::always_inline]] attribute to the write_escaped_cp function.

With write_escaped_cp always inlined, this problem dates back to GCC 11.
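
As a rough illustration of that experiment, the GNU attribute can be attached as shown below. The functions are invented stand-ins, not fmt's `write_escaped_cp`, and the `SKETCH_ALWAYS_INLINE` macro is an assumption added for portability. The point is only that forcing the helper into its caller lets the optimizer, and therefore the warning pass, analyze both bodies together.

```cpp
#include <cassert>
#include <string>

// Expands to the GNU attribute on GCC-compatible compilers, plain inline
// elsewhere.
#if defined(__GNUC__)
#  define SKETCH_ALWAYS_INLINE [[gnu::always_inline]] inline
#else
#  define SKETCH_ALWAYS_INLINE inline
#endif

// Hypothetical helper: appends c, writing a newline as backslash + 'n'.
SKETCH_ALWAYS_INLINE void write_escaped_char(std::string& out, char c) {
  if (c == '\n') {
    out += "\\n";
  } else {
    out += c;
  }
}

// Hypothetical caller: with the attribute above, GCC inlines the helper here
// regardless of its usual inlining heuristics.
inline std::string escape_newlines(const std::string& in) {
  std::string out;
  for (char c : in) write_escaped_char(out, c);
  return out;
}
```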

@sunmy2019
Contributor

So I inlined all those functions manually and found that GCC keeps warning on the following code snippet.

https://godbolt.org/z/Y3E5fh9G6

    if (num_chars_left > 0 && num_chars_left < block_size) {
      char buf[2 * block_size - 1] = {};

      {
        auto _begin = p;
        auto _out = buf;

        while (_begin < p + num_chars_left) *_out++ = *_begin++;  // warns here
      }
      // ...
    }

Full code snippet:

#include <fmt/format.h>

namespace fmt {
inline namespace v11 {
namespace detail {

auto write_escaped_string(basic_appender<char> out,
                          const basic_string_view<char> str) {
  *out++ = '"';

  const auto str_data = str.data();
  const auto str_size = str.size();

  auto begin = str_data, end = str_data + str_size;

  do {
    find_escape_result<char> escape;

    auto decode = [&](const char* buf_ptr, const char* ptr) -> const char* {
      auto cp = uint32_t();
      auto error = 0;
      auto end = utf8_decode(buf_ptr, &cp, &error);

      if (needs_escape(error ? invalid_code_point : cp)) {
        auto sv = string_view(ptr, error ? 1 : to_unsigned(end - buf_ptr));
        escape = {sv.begin(), sv.end(), cp};
        return nullptr;
      }
      return error ? buf_ptr + 1 : end;
    };

    auto p = begin;
    const size_t block_size = 4;  // utf8_decode always reads blocks of 4 chars.

    if (p <= end - block_size) {
      for (; p <= end - block_size;) {
        p = decode(p, p);
        if (!p) goto escape_found;
      }
    }

    // invariant:
    // 1. p >= end - block_size + 1
    // 2. end - p < block_size

    if (const auto num_chars_left = end - p;
        num_chars_left > 0 && num_chars_left < block_size) {
      // invariant: num_chars_left < block_size

      char buf[2 * block_size - 1] = {};

      {
        auto _begin = p;
        auto _out = buf;

        while (_begin < p + num_chars_left) *_out++ = *_begin++;
      }

      const char* buf_ptr = buf;
      do {
        auto end2 = decode(buf_ptr, p);
        if (!end2) goto escape_found;
        p += end2 - buf_ptr;
        buf_ptr = end2;
      } while (buf_ptr - buf < num_chars_left);
    }

  escape_found:

    out = copy<char>(begin, escape.begin, out);

    begin = escape.end;

    if (!begin) break;

    auto c = escape.cp;
    switch (escape.cp) {
    case '\n':
      *out++ = '\\';
      c = 'n';
      break;
    case '\r':
      *out++ = '\\';
      c = 'r';
      break;
    case '\t':
      *out++ = '\\';
      c = 't';
      break;
    case '"':
      FMT_FALLTHROUGH;
    case '\'':
      FMT_FALLTHROUGH;
    case '\\':
      *out++ = '\\';
      break;
    default:
      if (escape.cp < 0x100) {
        out = write_codepoint<2, char>(out, 'x', escape.cp);
        continue;
      }
      if (escape.cp < 0x10000) {
        out = write_codepoint<4, char>(out, 'u', escape.cp);
        continue;
      }
      if (escape.cp < 0x110000) {
        out = write_codepoint<8, char>(out, 'U', escape.cp);
        continue;
      }

      for (char escape_char : basic_string_view<char>(
               escape.begin, to_unsigned(escape.end - escape.begin))) {
        out = write_codepoint<2, char>(
            out, 'x', static_cast<uint32_t>(escape_char) & 0xFF);
      }
      continue;
    }
    *out++ = c;

  } while (begin != end);

  *out++ = '"';
  return out;
}

}  // namespace detail
}  // namespace v11
}  // namespace fmt

@sunmy2019
Contributor

I carefully read through the logic and concluded that it should be a false positive. We may need a hack to suppress that compiler warning.
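
One common shape for such a hack is a locally scoped diagnostic pragma, sketched below. This is a generic illustration and not necessarily what fmt ended up doing: `copy_bytes` and `SKETCH_GCC` are made-up names, and the guard restricts the pragma to GCC on the assumption that other compilers may not recognize `-Wstringop-overflow`.

```cpp
#include <cassert>

// Only real GCC gets the pragma; Clang also defines __GNUC__, so exclude it.
#if defined(__GNUC__) && !defined(__clang__)
#  define SKETCH_GCC 1
#else
#  define SKETCH_GCC 0
#endif

// Byte-wise copy with the warning disabled just around the loop.
inline char* copy_bytes(const char* begin, const char* end, char* out) {
#if SKETCH_GCC
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Wstringop-overflow"
#endif
  while (begin != end) *out++ = *begin++;
#if SKETCH_GCC
#  pragma GCC diagnostic pop
#endif
  return out;
}
```

Because this warning is issued after inlining, whether a pragma placed at the copy loop actually silences it can depend on the GCC version and on where the diagnostic is attributed, so it would need to be checked against the reproducer.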

@vitaut
Contributor

vitaut commented Aug 24, 2024

It does look like a false positive, so it might be worth reporting to GCC, but I applied a workaround in 0379bf3.

vitaut closed this as completed Aug 24, 2024
@sunmy2019
Contributor

> It does look like false positive so might be worth reporting to gcc but I applied a workaround in 0379bf3.

Yeah, but it's a temporary workaround. The same warning happens if write_escaped_cp is always inlined.

msimberg added a commit to msimberg/pika that referenced this issue Aug 26, 2024
msimberg added a commit to msimberg/pika that referenced this issue Aug 27, 2024
@vitaut
Contributor

vitaut commented Aug 28, 2024

> The same warning happens if write_escaped_cp is always inlined.

If you have a better solution, a PR would be welcome.
