Work around xl compiler bug when nvcc preprocesses this file #2190
I agree that my contributions are licensed under the {fmt} license, and agree to future changes to the licensing.
This works around a compiler bug in xlc when the file is preprocessed by nvcc. This manifests as `format_arg_store::desc` being undefined at link time. Pulling the `detail::encode_types<Context, Args...>()` call out into its own line was the fix.

If you prefer to have this fixed in another way, I am open to suggestions.
More information here:
LLNL/axom#492
Write up about the minimal compiler reproducer by @joshessman-llnl:
So I was initially just looking to see how minimally the behavior we were seeing with fmt could be reproduced, and was able to get it down to a small reproducer.

The static member access through the `.` operator in it is a little unconventional, but it's how `fmt` uses the `desc` variable that was modified in this PR.

I was still curious as to why it only failed with NVCC + XLC with `-x cu` and not with just XLC or NVCC + XLC not compiled as CUDA, so I started digging into the intermediate files. It looks like the CUDA C++ frontend (`cudafe++`) parenthesizes the `foo.bar` access, and this is what triggers the XLC bug. The parentheses are not meaningless here: I believe the type of `foo.bar` is `int` while the type of `(foo.bar)` is `int&`, so it's possible that the reference-ness is the cause of the bug. I would consider NVCC's changing of `int` -> `int&` to also be a bug or at least unexpected, though `clang++` is able to handle this.

After some further investigation it looks like XLC (run standalone, not through NVCC) only fails to link the `cudafe++`-parenthesized version when debug symbols are enabled (`-g`) and with zero optimizations enabled (`-O0`).

In summary, I think the XLC bug requires the following:
- A `constexpr static` member initialized with the result of a `constexpr` function instead of a literal
- Access to a `constexpr static` member via an instance of the class instead of `X::s`
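
The two conditions above can be put together in a small sketch. This is not the original reproducer from the write-up: `compute`, `read_via_instance`, `read_parenthesized`, and `read_qualified` are hypothetical names chosen for illustration, and only `X::s` comes from the summary. The sketch also checks at compile time how parenthesizing the member access changes what `decltype` reports (here `const int` versus `const int&`, because `constexpr` implies `const`).

```cpp
#include <type_traits>

// Hypothetical stand-in for a constexpr helper such as encode_types.
constexpr int compute() { return 42; }

struct X {
  // Condition 1: a constexpr static member initialized from a constexpr
  // function call rather than from a literal.
  static constexpr int s = compute();
};

// Out-of-class definition: needed before C++17 if the member is odr-used,
// redundant (but still allowed) from C++17 on.
constexpr int X::s;

// Condition 2: the static member is reached through an instance.
inline int read_via_instance() {
  X x;
  return x.s;
}

// The same access with parentheses added, the form the write-up attributes
// to cudafe++.
inline int read_parenthesized() {
  X x;
  return (x.s);
}

// The conventional qualified access, for comparison.
inline int read_qualified() { return X::s; }

// Unparenthesized member access: decltype gives the declared type.
static_assert(std::is_same<decltype(X{}.s), const int>::value,
              "decltype(x.s) is the declared type of the member");

// Parenthesized member access: decltype gives an lvalue reference.
static_assert(std::is_same<decltype((X{}.s)), const int&>::value,
              "decltype((x.s)) is an lvalue reference");
```

On a conforming compiler all three functions return the same value. Per the write-up, the failure only showed up at link time with XLC on the parenthesized form, with `-g` and `-O0`.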