Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PoC] Segmentation fault (possibly due to ODR violation) when mixing format.h with and without ostream.h in different cpp files #2357

Closed
devgs opened this issue Jun 14, 2021 · 9 comments

Comments

@devgs
Copy link

devgs commented Jun 14, 2021

I won't speculate, it's just a hypothesis that this is an ODR violation. Maybe it's something else. Maybe library makes a static check and interprets the packaged argument one way or another, depending on static conditions, that can result in two different outcomes: depending on whether fmt/ostream.h is included.

I haven't dug too deep into the library. I've just stoppen on a PoC to reproduce the issue. The issue stems from a fact that some class Foo has both: (1) conversion to std::string_view and (2) operator << (std::ostream&).

The PoC

foo.hpp

#include <string>
#include <string_view>
#include <iostream>

struct Foo
{
    explicit operator std::string_view() const {
        return _data;
    }

    std::string _data{"foo"};
};

inline std::ostream & operator << (std::ostream & stream, const Foo & foo)
{
    return stream.write(foo._data.data(), foo._data.size());
}

test.cpp

#include "foo.hpp"

#include <fmt/format.h>
// By accident, 'ostream.h' is not included
// #include <fmt/ostream.h>

void test()
{
    fmt::format("test {}", Foo{});
}

main.cpp

#include "foo.hpp"

#include <fmt/format.h>
#include <fmt/ostream.h>

// Just a declaration of function from test.cpp
void test();

int main()
{
    fmt::format("test {}", Foo{});
    // Uncomment to force Segfault no matter the compilation order.
    //test();
    return 0;
}

Compile and run

Note: The order of cpp files is important here. We want to instantiate test.cpp first and then setfault in main.cpp. Though it will work both ways, granted that both usages are invoked (uncomment call to test(); in main.cpp), one of the will always fail.

FreeBSD

clang --std=c++17 main.cpp test.cpp -I /usr/local/include -L /usr/local/lib -stdlib=libc++ -lc++ -lm -lfmt -o poc && ./poc

Linux

clang --std=c++17 test.cpp main.cpp -I /usr/local/include -L /usr/local/lib -lstdc++ -lm -lfmt -o poc && ./poc

The issue is pretty critical, given how hard it is to track all of the possible places where format() can accept argument by conversion to std::string_view, without a prior include of <fmt/ostream.h>.

Update:

Forgot to add that this issue only exists in v7.*. We were successfully using v6.* without any issues at all. Only noticed it after the upgrade.

@sunmy2019
Copy link
Contributor

sunmy2019 commented Jun 18, 2021

(The following analysis only work for current trunk (427b534). I did not look into the older versions.)

You should always include <fmt/ostream.h> if you want to use operator<<.

If you don't, it will never generate the code calling operator<<.

In your case,

fmt::format("test {}", Foo{});

In main.cpp, the generated function using operator<<.
In the test.cpp, the generated function is converting Foo to string_view then output.


In case you are curious about what causes segfault, just like I am, I have spent several hours digging into it.

The two generated fmt::format have different make_format_args.

The make_format_args which uses operator<< stores the address of format hander (a function address!) in the arg. The other make_format_args stores the string's size in the exact same memory location.

If you use both, the linker will throw away one make_format_args, it definitely will be a segfault if the size was interpreted as an function address.

If you only use one, you should pray the linker for not throwing away the correct make_format_args.

@devgs
Copy link
Author

devgs commented Jun 18, 2021

@sunmy2019
Welp, actually I don't want to use operator<<. In my real-world case, It was there for other types that don't have conversion to std::string_view. And by simply adding this include my program started to crash. In PoC it's presented for clarity: one of the files includes it, the other - doesn't.

What I'm saying is this is a HUGE pitfall that should be addressed. Anyone unsuspecting can step into it. Library should be implemented in a way that avoids this and not crash your program depending on what you include or not. But i think, it's already obvious and hope you're already planning how to fix this.

@sunmy2019
Copy link
Contributor

This HUGE pitfall should only happen if one class/struct has both an operator<< and an conversion to default formattable type (like string_view/int/float etc) Is it your case?

If you mark the conversion operator explicit, only a few situations will let the conversion happen, including the string_view in your case.

I don't know how to fix. I am not in charge here. Maybe include the <ostream.h> by default?


For internals:

The conversion to default formattable type is controlled by

fmt/include/fmt/core.h

Lines 1213 to 1352 in 427b534

template <typename Context> struct arg_mapper {
using char_type = typename Context::char_type;
FMT_CONSTEXPR FMT_INLINE auto map(signed char val) -> int { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(unsigned char val) -> unsigned {
return val;
}
FMT_CONSTEXPR FMT_INLINE auto map(short val) -> int { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(unsigned short val) -> unsigned {
return val;
}
FMT_CONSTEXPR FMT_INLINE auto map(int val) -> int { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(unsigned val) -> unsigned { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(long val) -> long_type { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(unsigned long val) -> ulong_type {
return val;
}
FMT_CONSTEXPR FMT_INLINE auto map(long long val) -> long long { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(unsigned long long val)
-> unsigned long long {
return val;
}
FMT_CONSTEXPR FMT_INLINE auto map(int128_t val) -> int128_t { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(uint128_t val) -> uint128_t { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(bool val) -> bool { return val; }
template <typename T, FMT_ENABLE_IF(is_char<T>::value)>
FMT_CONSTEXPR FMT_INLINE auto map(T val) -> char_type {
static_assert(
std::is_same<T, char>::value || std::is_same<T, char_type>::value,
"mixing character types is disallowed");
return val;
}
FMT_CONSTEXPR FMT_INLINE auto map(float val) -> float { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(double val) -> double { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(long double val) -> long double {
return val;
}
FMT_CONSTEXPR FMT_INLINE auto map(char_type* val) -> const char_type* {
return val;
}
FMT_CONSTEXPR FMT_INLINE auto map(const char_type* val) -> const char_type* {
return val;
}
template <typename T, FMT_ENABLE_IF(is_string<T>::value)>
FMT_CONSTEXPR FMT_INLINE auto map(const T& val)
-> basic_string_view<char_type> {
static_assert(std::is_same<char_type, char_t<T>>::value,
"mixing character types is disallowed");
return to_string_view(val);
}
template <typename T,
FMT_ENABLE_IF(
std::is_constructible<basic_string_view<char_type>, T>::value &&
!is_string<T>::value && !has_formatter<T, Context>::value &&
!has_fallback_formatter<T, char_type>::value)>
FMT_CONSTEXPR FMT_INLINE auto map(const T& val)
-> basic_string_view<char_type> {
return basic_string_view<char_type>(val);
}
template <
typename T,
FMT_ENABLE_IF(
std::is_constructible<std_string_view<char_type>, T>::value &&
!std::is_constructible<basic_string_view<char_type>, T>::value &&
!is_string<T>::value && !has_formatter<T, Context>::value &&
!has_fallback_formatter<T, char_type>::value)>
FMT_CONSTEXPR FMT_INLINE auto map(const T& val)
-> basic_string_view<char_type> {
return std_string_view<char_type>(val);
}
FMT_CONSTEXPR FMT_INLINE auto map(const signed char* val) -> const char* {
static_assert(std::is_same<char_type, char>::value, "invalid string type");
return reinterpret_cast<const char*>(val);
}
FMT_CONSTEXPR FMT_INLINE auto map(const unsigned char* val) -> const char* {
static_assert(std::is_same<char_type, char>::value, "invalid string type");
return reinterpret_cast<const char*>(val);
}
FMT_CONSTEXPR FMT_INLINE auto map(signed char* val) -> const char* {
const auto* const_val = val;
return map(const_val);
}
FMT_CONSTEXPR FMT_INLINE auto map(unsigned char* val) -> const char* {
const auto* const_val = val;
return map(const_val);
}
FMT_CONSTEXPR FMT_INLINE auto map(void* val) -> const void* { return val; }
FMT_CONSTEXPR FMT_INLINE auto map(const void* val) -> const void* {
return val;
}
FMT_CONSTEXPR FMT_INLINE auto map(std::nullptr_t val) -> const void* {
return val;
}
// We use SFINAE instead of a const T* parameter to avoid conflicting with
// the C array overload.
template <typename T>
FMT_CONSTEXPR auto map(T) -> enable_if_t<std::is_pointer<T>::value, int> {
// Formatting of arbitrary pointers is disallowed. If you want to output
// a pointer cast it to "void *" or "const void *". In particular, this
// forbids formatting of "[const] volatile char *" which is printed as bool
// by iostreams.
static_assert(!sizeof(T), "formatting of non-void pointers is disallowed");
return 0;
}
template <typename T, std::size_t N>
FMT_CONSTEXPR FMT_INLINE auto map(const T (&values)[N]) -> const T (&)[N] {
return values;
}
template <typename T,
FMT_ENABLE_IF(std::is_enum<T>::value &&
!has_formatter<T, Context>::value &&
!has_fallback_formatter<T, char_type>::value)>
FMT_CONSTEXPR FMT_INLINE auto map(const T& val)
-> decltype(std::declval<arg_mapper>().map(
static_cast<typename std::underlying_type<T>::type>(val))) {
return map(static_cast<typename std::underlying_type<T>::type>(val));
}
template <typename T,
FMT_ENABLE_IF(!is_string<T>::value && !is_char<T>::value &&
(has_formatter<T, Context>::value ||
has_fallback_formatter<T, char_type>::value))>
FMT_CONSTEXPR FMT_INLINE auto map(const T& val) -> const T& {
return val;
}
template <typename T, FMT_ENABLE_IF(is_named_arg<T>::value)>
FMT_CONSTEXPR FMT_INLINE auto map(const T& named_arg)
-> decltype(std::declval<arg_mapper>().map(named_arg.value)) {
return map(named_arg.value);
}
auto map(...) -> unformattable { return {}; }
};

Here we use function overload to decide what to convert. If the conversion operator is marked explicit, only those types with exact match (here means a template) could happen.

@devgs
Copy link
Author

devgs commented Jun 18, 2021

This HUGE pitfall should only happen if one class/struct has both an operator<< and an conversion to default formattable type (like string_view/int/float etc) Is it your case?

You can clearly see this from my PoC. Yes, it is my case.

If you mark the conversion operator explicit, only a few situations will let the conversion happen, including the string_view in your case.

You can see from the PoC that it is explicit.

I don't know how to fix. I am not in charge here. Maybe include the <ostream.h> by default?

Okay then. Then I'm more confused than before. Why do you so confidently suggest what I should do and what is/is-not a HUGE pitfall, if you're not the maintainer of this library? Isn't it up to them to counter my argument?

Here we use function overload to decide what to convert. If the conversion operator is marked explicit, only those types with exact match (here means a template) could happen.

Again, as you can see from PoC, it is explicit. FYI std::is_constructible returns true event in case of explicit conversion operator. You've probably though about std::is_convertible https://godbolt.org/z/39KqvYf7Y. But no, library uses std::is_constructible, so explicit won't help.

@devgs
Copy link
Author

devgs commented Jun 18, 2021

Maybe include the <ostream.h> by default?

Sorry, but I couldn't stop myself from writing this.

Imagine a hypothetical scenario. You bought a new coffee machine. It's super convenient, the coffee is perfect, but then there's one thing that bothers you. Among the myriad of buttons there's one special. A really small one, located somewhere on a side. When you press it (hopefully by a accident) the coffee machine spays a hot steam in your face. You quickly open a manual, read through it and yeah, found a small script at a bottom of one of the pages: Maybe you should avoid pressing this button by default? 😄

Your words sound to me exactly like this hypothetical scenario.

@vitaut
Copy link
Contributor

vitaut commented Jun 23, 2021

Thanks for reporting. This is indeed problematic behavior and we should fix or at least diagnose it.

@devgs
Copy link
Author

devgs commented Jan 13, 2022

Any hope for a progress on this in the upcoming releases?

@vitaut
Copy link
Contributor

vitaut commented Jan 13, 2022

I plan to look into this issue but don't have any specific timeline yet.

@vitaut
Copy link
Contributor

vitaut commented Feb 4, 2022

Fixed in f055ebb. Including fmt/ostream.h no longer magically changes formatting behavior and an explicit opt in is required instead, e.g.

template <> struct fmt::formatter<Foo> : ostream_formatter {};

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants