CI: add a test that builds with clang14 #1498

lgritz · 2022-05-02T06:51:19Z

A few minor changes were needed to address new warnings.

clang14 warns if you use bitwise | or & in an if, but there are
some places where we do that purposely to avoid branches.

Also some batch shading related vectorized loops fail to vectorize
with clang14 -- though they worked fine with clang12 -- and so we
needed to mark those as "complex" SIMD loops. Is this a regression
with clang14?

Signed-off-by: Larry Gritz [email protected]

AlexMWells · 2022-05-02T18:17:40Z

src/liboslexec/batched_analysis.cpp

@@ -106,6 +106,10 @@ namespace  // Unnamed
 bool
 are_op_results_always_implicitly_varying(ustring opname)
 {
+    OIIO_PRAGMA_WARNING_PUSH
+#if OIIO_CLANG_VERSION >= 140000


This feels pretty noisy in the source, why not just disable it via the command line flag -Wno-bitwise-instead-of-logical?
Every use of | or & should have been an explicit decision.

Yeah, I could do that I suppose.

I wasn't sure of the merits of disabling the warning for the whole codebase, which of course would have the effect of NOT catching any times we accidentally used bitwise ops when we meant logical. That warning, for the vast majority of code, is not a bad idea. I think it serves a good purpose.

On the other hand, we never had the warning before, and now that we've turned it on, the only places it caught were instances where we purposely used the bitwise ops. So maybe it's trying to catch a mistake that we never make in practice, so it's always a false positive.

Let me mull this over (@fpsunflower or others, do you have an opinion?) but I think you have me leaning toward thinking that you are right and I should just disable the warning wholesale.

As the warning it self doesn't can't actually determine if you intended to skip the short-circuit behavior of logical || and &&, IMO it shouldn't be on by default. If tracking down a bug, maybe it would be nice to get such a report of all places bitwise was used to verify they were all intentional, but certainly not as part of a -Werror build configuration.
As a typical use is
if (ptr && ptr->foobar())
If that were just a bitwise &, then both sides will execute and the order is undefined:
if (ptr & ptr->foobar())
then a crash might occur. However, C++ has made it 37 years without this warning,
mostly because the call stack should lead them right to the offending line. However if not a simple invalid pointer some incorrect behavior of the "order undefined" and "both sides" execute behavior could happen.

The warning itself actually says
note: cast one or both operands to int to silence this warning
Cast to an int? But it was already a bool and adding additional casts just to make this warning happy doesn't make sense. Bitwise | and & on bools is legitimate, and not a cause for warning. Maybe there should be a -Wlook-for-common-typos flag that does this (and others like assignment inside if statement).

For performance code we often want to skip the short circuit and if an additional branch is generated is a performance degradation, so there might be a trade off between warning for a "potential" typo and performance.

Take a look at example
https://godbolt.org/z/bnKMKnn5K
We can see a jmp in the assembly, change MY_OR at the top to bitwise | and it goes away. Branch prediction penalties and additional management of SIMD masks and mask registers are real costs. So developer should choose || or | depending on their needs.

I'm usually in favor of adopting all warnings, but this one doesn't seem to be fully thought out. I'm fine disabling this one for now.

AlexMWells · 2022-05-02T18:22:47Z

src/liboslexec/batched_llvm_gen.cpp

@@ -5933,7 +5933,7 @@ LLVMGEN (llvm_gen_gettextureinfo)
    FuncSpec func_spec("get_textureinfo_uniform");
    const char * funcName = rop.build_name(func_spec);

-    bool requires_binning = (!Filename.is_uniform()) | (!Dataname.is_uniform()) | use_coords;
+    bool requires_binning = (!Filename.is_uniform()) || (!Dataname.is_uniform()) || use_coords;


These were all changed to bitwise from logical to avoid generating branching code. As this is during compilation, probably not measurable impact. But if we disable the check with "-Wno-bitwise-instead-of-logical" then these files would not even need to be in this pull request.

I put the guards (i.e. kept the bitwise ops) in all code paths that are called during shader execution or that are in code we are trying to vectorize.

There were a few instances -- including this one, which is where we're generating IR -- where the code is not trying to be vectorized, not called by the shaders themselves, and not especially performance sensitive. And in those circumstances, I reverted to the logical operations because that's the real meaning of this statement and there is nothing to gain from eliminating a single branch here.

I also looked at godbolt and see that whether && and || are turned into branchless & and | automatically by the C++ optimizer depends heavily on the specific compiler and version. Modern icc and clang are pretty good at it in many circumstances. gcc is not. So I think the use of bitwise ops is only important in areas that are extremely performance sensitive and we don't want to take any chances that the compiler will fail to optimize away the branch.

Agreed that the change for batched_llvm_gen.cpp and other places isn't performance critical. And that change is fine. Just noting as they were all bools it is somewhat equivalent when the compiler does automatically turn them into & and |, kinda shows that they are valid to be & and | to begin with.

Also a compiler "might" turn && and || are turned into branchless & and | automatically. Turning optimization off though probably won't turn && and || are turned into branchless & and |. So one might prefer to never generate the branch in the first place, to better ensure similar execution during debug. Also the example I posted, clearly a jmp is still generated with the || at O3, because it is contractually obligated to by language rules around short-circuit. And Clang might be good and that automatic conversion if it can prove no side effects (which is good feature), but other compilers or optimization levels may not be and still generate jmps.

I've updated the PR to revert all the individual line-by-line pragmas in favor of just disabling this warning for all of liboslexec.

AlexMWells · 2022-05-02T19:30:38Z

src/liboslexec/wide/wide_opcolor.cpp

@@ -309,13 +309,13 @@ wide_transformc (ColorSystem cs, StringParam fromspace, StringParam tospace,
    Wide<COLOR> wCrgb(bCrgb);
    if (fromspace == STRING_PARAMS(RGB) || fromspace == STRING_PARAMS(rgb)
         || fromspace == STRING_PARAMS(linear) || fromspace == cs.colorspace()) {
-        OSL_OMP_PRAGMA(omp simd simdlen(__OSL_WIDTH))
+        OSL_OMP_COMPLEX_SIMD_LOOP(simdlen(__OSL_WIDTH))


To not loose vectorization for clang 13 and below, you might want to conditionally use

#if OSL_CLANG_VERSION > 140000 OSL_OMP_COMPLEX_SIMD_LOOP(simdlen(__OSL_WIDTH)) #else OSL_OMP_PRAGMA(omp simd simdlen(__OSL_WIDTH)) #endif

This does appear to be a regression in Clang14 of some kind. Its odd that these color operations, especially this super simple assignment loop, are having issues (where as much more complex noises and spline operations continue to vectorize). So I suspect it has something to do with the underlying Color3 class itself, of the organization/optimization of wide_transformc. Given that other functions in wide_opcolor.cpp are vectorizing fine using Color3, that leads me to think its wide_transformc itself.
I don't have an easy way to reproduce without moving to clang14 myself.

Some experiments to try:

Change ColorSystem parameters to be const & vs. copy by value (this looks like it might be an existing issue):

wide_transformc (ColorSystem cs, ...

to

wide_transformc (const ColorSystem &cs, ...

SIMD loop isolation. I have seen compilers try to do loop fusion and other magic that messes up otherwise valid vectorizing loops. We do this in wide_opmatrix.cpp utilizing a simple noinline helper functor called invoke. This "isolation" technique can also help with register allocation by limiting the scope of what the optimizer has to deal with in a single function:

namespace { // invoke is helper to ensure a functor is compiled inside a // actual function call vs. inlining. // Useful for keeping the slow path from fusing // with a streamlined SIMD loop or influencing // register usage template<typename FunctorT> OSL_NOINLINE void invoke(FunctorT f); template<typename FunctorT> void invoke(FunctorT f) { f(); } }

then inside of wide_transformc wrap each simd loop in its own invoke call with a lambda expression

invoke([=]() -> void { OSL_OMP_PRAGMA(omp simd simdlen(__OSL_WIDTH)) for (int lane = 0; lane < __OSL_WIDTH; ++lane) { COLOR C = wInput[lane]; wCrgb[lane] = C; } });

for the loops that use the ColorSystem argument cs might want to add that as a reference in the lamba capture clause.

invoke([=,&cs]() -> void {

AlexMWells

LGTM.

Once merged, might want to come back and experiment to see if vectorization can be restored for wide_opcolor.cpp with clang14.

A few minor changes were needed to address new warnings. clang14 warns if you use bitwise `|` or `&` in an `if`, but there are some places where we do that purposely to avoid branches. Also some batch shading related vectorized loops fail to vectorize with clang14 -- though they worked fine with clang12 -- and so we needed to mark those as "complex" SIMD loops. Is this a regression with clang14? Signed-off-by: Larry Gritz <[email protected]>

Revert the invididual line-by-line pragmas, do it for the whole library. Signed-off-by: Larry Gritz <[email protected]>

Signed-off-by: Larry Gritz <[email protected]>

AlexMWells reviewed May 2, 2022

View reviewed changes

AlexMWells approved these changes May 4, 2022

View reviewed changes

lgritz added 3 commits May 3, 2022 21:22

Disable clang14 warnings about bitwise ops for all of liboslexec

732ef86

Revert the invididual line-by-line pragmas, do it for the whole library. Signed-off-by: Larry Gritz <[email protected]>

Revise simd loop pragma for wide_transformc due to llvm14 regression

03d94ad

Signed-off-by: Larry Gritz <[email protected]>

lgritz force-pushed the lg-clang14 branch from 01c5116 to 03d94ad Compare May 4, 2022 04:22

Updated INSTALL requirements

f79023c

Signed-off-by: Larry Gritz <[email protected]>

lgritz merged commit 10b7264 into AcademySoftwareFoundation:main May 4, 2022

lgritz deleted the lg-clang14 branch May 4, 2022 06:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CI: add a test that builds with clang14 #1498

CI: add a test that builds with clang14 #1498

lgritz commented May 2, 2022

AlexMWells May 2, 2022

lgritz May 2, 2022 •

edited

Loading

AlexMWells May 2, 2022

fpsunflower May 2, 2022

AlexMWells May 2, 2022

lgritz May 2, 2022 •

edited

Loading

AlexMWells May 2, 2022

lgritz May 3, 2022

AlexMWells May 2, 2022

AlexMWells left a comment •

edited

Loading

CI: add a test that builds with clang14 #1498

CI: add a test that builds with clang14 #1498

Conversation

lgritz commented May 2, 2022

AlexMWells May 2, 2022

Choose a reason for hiding this comment

lgritz May 2, 2022 • edited Loading

Choose a reason for hiding this comment

AlexMWells May 2, 2022

Choose a reason for hiding this comment

fpsunflower May 2, 2022

Choose a reason for hiding this comment

AlexMWells May 2, 2022

Choose a reason for hiding this comment

lgritz May 2, 2022 • edited Loading

Choose a reason for hiding this comment

AlexMWells May 2, 2022

Choose a reason for hiding this comment

lgritz May 3, 2022

Choose a reason for hiding this comment

AlexMWells May 2, 2022

Choose a reason for hiding this comment

AlexMWells left a comment • edited Loading

Choose a reason for hiding this comment

lgritz May 2, 2022 •

edited

Loading

lgritz May 2, 2022 •

edited

Loading

AlexMWells left a comment •

edited

Loading