Add full support of format string parsing in compile-time API #2129
Looks great, thanks for another high quality PR!
include/fmt/compile.h
Outdated
constexpr void on_error(const char* message) { throw format_error(message); }

constexpr int on_arg_id() {
  throw format_error("handler cannot be used for empty arg_id");
"for empty arg_id" -> "with automatic indexing"
Also can this be an assert?
But it actually can be used for automatic indexing with named identifiers. Both runtime and (now) compile-time APIs keep automatic indexing when a named argument identifier is used. So it just cannot be used for an unnamed argument identifier in the automatic indexing mode, which this message is trying to say.
By the way, this function wouldn't be used in normal conditions because the code that invokes this handler actually controls that this handler is used only for numeric or named arguments. As long as it's true, no one would see this message, but when someone breaks the parsing code, they will get this message.
> Also can this be an assert?
As I said, it just indicates an internal error, so the cause of this compile-time error can be anything that is not compile-time friendly. I saw several usages of `throw format_error(...)` and use it too.
I'm not entirely sure what you mean by "unnamed argument identifier". Both "{}" and "{:...}" denote automatic indexing which is why I'm suggesting this minor wording change. It doesn't matter much since it's an internal error but a bit more consistent with the wording elsewhere.
> it just indicates an internal error
Right and this is exactly why I'm suggesting to use an assert if possible. This will distinguish an internal error from a user error even though they both result in a compilation error. If assert doesn't work for some reason, then throw is OK.
Yes, "unnamed argument identifier" sounds a bit strange. 🙂
But the problem is probably my wrong understanding of how named arguments work. After updating this PR (as I wrote here), this wording problem would probably be eliminated.
Done
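For readers following the `throw` versus assert discussion above, here is a minimal sketch of the distinction, assuming C++14 or later. The helper names `parse_digit` and `digit_value_unchecked` are hypothetical and not part of {fmt}; the library's own assertion macro is not shown. Either mechanism stops compilation when the failing path is reached during constant evaluation, but the assert marks the condition as a broken internal invariant rather than a bad format string.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical user-facing check: bad input is reported with `throw`.
// In a constant expression, reaching the throw is a compile error.
constexpr int parse_digit(char c) {
  if (c < '0' || c > '9') throw std::invalid_argument("expected a digit");
  return c - '0';
}

// Hypothetical internal helper: callers are expected to have validated `c`
// already, so a violation is a bug in the parsing code, not a user error.
constexpr int digit_value_unchecked(char c) {
  assert(c >= '0' && c <= '9');  // internal invariant
  return c - '0';
}

// Both are usable in constant expressions as long as the failing
// branch is never evaluated.
static_assert(parse_digit('7') == 7, "valid digit parses at compile time");
static_assert(digit_value_unchecked('3') == 3, "invariant holds");

// Uncommenting the next line would fail to compile, because the throw
// expression would be evaluated during constant evaluation:
// constexpr int bad = parse_digit('x');
```

When `NDEBUG` is defined the assert disappears, so the internal check costs nothing in release builds, while the `throw` path stays in place for user errors.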
template <typename Char> struct parse_arg_id_result {
  arg_ref<Char> arg_id;
  const Char* arg_id_end;
};
Can we pass `begin` by reference in `parse_arg_id` and avoid introducing this struct?
Hmm... it would be a reference to the pointer, or (IMHO better) a pointer to the pointer, is it ok?
Sure, I think reference is better unless it can be null.
Actually, it's probably impossible because there is a need to have `arg_id_end` as a constexpr variable or, more importantly, `begin` has to be a non-constexpr variable in that case, but it should be used in a constexpr context.
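The constraint described in this comment can be illustrated with a small sketch. The names `parse_uint_result`, `parse_uint` and `spec` are hypothetical and only mirror the shape of `parse_arg_id_result`; this is not the PR's code. Returning the end position inside a struct lets the caller keep every value in a `constexpr` variable, whereas an out-parameter of type `const char*&` would have to bind to a mutable pointer, which cannot itself be a `constexpr` variable.

```cpp
// Hypothetical result type, shaped like parse_arg_id_result above.
struct parse_uint_result {
  unsigned value;
  const char* end;  // first character after the parsed number
};

// Parses a run of decimal digits starting at `begin`.
constexpr parse_uint_result parse_uint(const char* begin, const char* end) {
  unsigned value = 0;
  while (begin != end && *begin >= '0' && *begin <= '9') {
    value = value * 10 + static_cast<unsigned>(*begin - '0');
    ++begin;  // modifies only the local copy of the pointer
  }
  return {value, begin};
}

// The input has static storage duration, so pointers into it are valid
// constant expressions.
constexpr char spec[] = "42}";

// Both the parsed value and the end position are constexpr here, so they
// can feed further compile-time decisions (for example `if constexpr`).
constexpr auto result = parse_uint(spec, spec + 3);
static_assert(result.value == 42, "value is available at compile time");
static_assert(*result.end == '}', "end position is available at compile time");
```

With a signature such as `constexpr unsigned parse_uint(const char*& begin, const char* end)`, the call site would need a non-const `begin` variable, and the updated position could then not be used as a template argument or in `if constexpr`.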
test/compile-test.cc
Outdated
struct test_custom_formattable {};

FMT_BEGIN_NAMESPACE
template <> struct formatter<test_custom_formattable> {
  enum class output_type { two, four } type{output_type::two};

  FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
    auto it = ctx.begin(), end = ctx.end();
    while (it != end && *it != '}') {
      ++it;
    }
    auto spec = string_view(ctx.begin(), static_cast<size_t>(it - ctx.begin()));
    auto tag = string_view("custom");
    if (spec.size() == tag.size()) {
      bool is_same = true;
      for (size_t index = 0; index < spec.size(); ++index) {
        if (spec[index] != tag[index]) {
          is_same = false;
          break;
        }
      }
      type = is_same ? output_type::four : output_type::two;
    } else {
      type = output_type::two;
    }
    return it;
  }

  template <typename FormatContext>
  auto format(const test_custom_formattable&, FormatContext& ctx) const
      -> decltype(ctx.out()) {
    return format_to(ctx.out(), type == output_type::two ? "{:>2}" : "{:>4}",
                     42);
  }
};
FMT_END_NAMESPACE
I suggest using one of the existing formatters such as duration formatter instead of introducing a new one here.
One problem here is that the `chrono::duration` formatter is not ready to be used with the compile-time API because of the `format()` constness requirement. Should I update it in this PR or the separate one?
> Should I update it in this PR or the separate one?
This PR is OK since it should be a small change.
Done with the weirdest looking format string from chrono-test
test/compile-test.cc
Outdated
FMT_BEGIN_NAMESPACE
template <> struct formatter<test_dynamic_formattable> {
  size_t amount = 0;
  detail::arg_ref<char> width_refs[3];

  FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
    amount = static_cast<size_t>(*ctx.begin() - '0');
    if (amount >= 1) {
      width_refs[0] = detail::arg_ref<char>(ctx.next_arg_id());
    }
    if (amount >= 2) {
      width_refs[1] = detail::arg_ref<char>(ctx.next_arg_id());
    }
    if (amount >= 3) {
      width_refs[2] = detail::arg_ref<char>(ctx.next_arg_id());
    }
    return ctx.begin() + 1;
  }

  template <typename FormatContext>
  auto format(const test_dynamic_formattable&, FormatContext& ctx) const
      -> decltype(ctx.out()) {
    int widths[3]{};
    for (size_t i = 0; i < amount; ++i) {
      detail::handle_dynamic_spec<detail::width_checker>(widths[i],
                                                         width_refs[i], ctx);
    }
    if (amount == 1) {
      return format_to(ctx.out(), "{:{}}", 41, widths[0]);
    } else if (amount == 2) {
      return format_to(ctx.out(), "{:{}}{:{}}", 41, widths[0], 42, widths[1]);
    } else if (amount == 3) {
      return format_to(ctx.out(), "{:{}}{:{}}{:{}}", 41, widths[0], 42,
                       widths[1], 43, widths[2]);
    } else {
      throw format_error("formatting error");
    }
  }
};
FMT_END_NAMESPACE
Same here. duration formatter has dynamic field support.
Actually, the previous one (about the custom formatter) and this one are not the same.
Yes, it has dynamic field support. But as far as I can see, it supports the same set of nested replacement fields as the default formatter, `{:{}.{}}`. So handling 2 dynamic fields for the default formatter would probably be enough to pass the test with the `chrono::duration` formatter.
This custom formatter, in contrast, has a custom syntax for nested replacement fields (not `{:{}.{}}`), and it has 3 of them. So handling default dynamic fields wouldn't be enough to pass the test.
I don't think we need to test the implementation of exotic formatter specializations here.
Done with format string from chrono-test that uses dynamic specs
Actually, I'm going to make this PR a draft (yep, again 😄).
…replacement fields
… instead of `throw`
Two minor comments, otherwise looks good.
include/fmt/compile.h
Outdated
-    const T& arg = get<N>(args...);
-    return write<Char>(out, arg);
+    if constexpr (is_named_arg<typename std::remove_cv<T>::type>::value) {
+      decltype(T::value) arg = get<N>(args...).value;
I think `decltype(T::value)` can be replaced with a bit simpler `const auto&`
Done
include/fmt/compile.h
Outdated
-  if constexpr (str.size() == 2 && str[0] == '{' && str[1] == '}')
-    return fmt::to_string(detail::first(args...));
+  if constexpr (str.size() == 2 && str[0] == '{' && str[1] == '}') {
+    auto first = detail::first(args...);
This may introduce an extra copy. Please use `const auto&` instead of `auto`.
Done
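The extra copy mentioned in the two comments above can be made visible with a small sketch, assuming C++14 or later. The `tracked` type and this `first` function are hypothetical stand-ins written for illustration; they only imitate a helper that returns its first argument by const reference.

```cpp
// Hypothetical type that records how many copies led to this object.
struct tracked {
  int copies = 0;
  constexpr tracked() = default;
  constexpr tracked(const tracked& other) : copies(other.copies + 1) {}
};

// Stand-in for a helper that returns its first argument by const reference.
template <typename T, typename... Tail>
constexpr const T& first(const T& value, const Tail&...) {
  return value;
}

// `auto` deduces a value type, so the referenced object is copied.
constexpr int copies_with_auto() {
  tracked t{};
  auto copy = first(t, 1, 2);
  return copy.copies;
}

// `const auto&` binds directly to the original object, with no copy.
constexpr int copies_with_const_ref() {
  tracked t{};
  const auto& ref = first(t, 1, 2);
  return ref.copies;
}

static_assert(copies_with_auto() == 1, "auto makes one copy");
static_assert(copies_with_const_ref() == 0, "const auto& makes no copy");
```

For cheap types such as `int` the difference is negligible, but the formatted argument may be a large string or container, which is why the reference form is preferred in generic code.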
Merged, thanks!
Compile-time API functionality extended to support manual ordering and named arguments. Unlike my first attempt to do this in #2111, where I tried to use a format part array, here I'm just reusing that recursion of functions `compile_format_string()` and `parse_tail()`.

Some points for the changes:
- `{0}` and automatic indexing with `{name}` work exactly as they work in the runtime API
- `static_assert`s fail with corresponding messages
- `unknown_format()` is returned from string compilation procedure, thus we fall back to the runtime API for this string (but this fallback is currently broken)
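
As a rough illustration of how indexing rules can be checked while a format string is scanned at compile time, here is a minimal sketch assuming C++17. The names `indexing` and `detect_indexing` are hypothetical and this is not the code from the PR. Following the discussion above, named fields such as `{name}` are assumed to leave the indexing mode unchanged, and mixing numeric ids with automatic fields is treated as an error.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical classification of the indexing mode used by a format string.
enum class indexing { none, automatic, manual };

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Scans replacement fields and reports the indexing mode.
// Throws (a compile error in constant evaluation) on mixed indexing.
constexpr indexing detect_indexing(std::string_view fmt) {
  indexing mode = indexing::none;
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] != '{') continue;
    if (i + 1 == fmt.size()) throw std::invalid_argument("unmatched '{'");
    const char next = fmt[i + 1];
    if (next == '{') {  // "{{" is an escaped brace, not a field
      ++i;
      continue;
    }
    indexing field = indexing::none;  // named fields keep the current mode
    if (is_digit(next)) {
      field = indexing::manual;  // "{0}", "{1:...}"
    } else if (next == '}' || next == ':') {
      field = indexing::automatic;  // "{}", "{:...}"
    }
    if (field == indexing::none) continue;
    if (mode != indexing::none && mode != field)
      throw std::invalid_argument("cannot mix manual and automatic indexing");
    mode = field;
  }
  return mode;
}

static_assert(detect_indexing("{} and {}") == indexing::automatic);
static_assert(detect_indexing("{0} and {1}") == indexing::manual);
static_assert(detect_indexing("{name} and {}") == indexing::automatic);
static_assert(detect_indexing("{{literal}}") == indexing::none);
```

Because the function is `constexpr`, a call such as `detect_indexing("{0} {}")` in a constant expression is rejected by the compiler, which is the same effect a failing `static_assert` has for the user.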