Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed builtin CLZ being used incorrectly #520

Closed
wants to merge 1 commit into from
Closed

Fixed builtin CLZ being used incorrectly #520

wants to merge 1 commit into from

Conversation

Cleroth
Copy link
Contributor

@Cleroth Cleroth commented Jun 16, 2017

Fixes #519

@vitaut
Copy link
Contributor

vitaut commented Jun 17, 2017

Thanks a lot for the fix but FMT_BUILTIN_CLZ* shouldn't be changed unconditionally. Moreover, the implementation is likely to be fixed in the future which will break the workaround. Perhaps it should be wrapped in #ifdef _MSC_VER:

#ifndef _MSC_VER
# if FMT_GCC_VERSION >= 400 || FMT_HAS_BUILTIN(__builtin_clz)
#  define FMT_BUILTIN_CLZ(n) __builtin_clz(n)
# endif
...
#endif

Will this work for you?

@Cleroth
Copy link
Contributor Author

Cleroth commented Jun 17, 2017

That would work, because it would avoid the broken code path...
I'm testing the result of __builtin_clz on different machines, and I'm getting the 'expected' value on one, and the 'inverted' value on the other. I think this actually may be due to the instruction not being supported on all CPUs.

According to this page:

It should be noted that on processors that do not support LZCNT, the instruction byte encoding is executed as BSR

And according to this:

LZCNT == (31 - BSR)

Thus, if you want to use LZCNT, you really need to check for whether it's supported. The compiler does not do this for you. See here (when you specify -march=native, clang will use lzcnt without checking for support, and so will GCC).

@vitaut
Copy link
Contributor

vitaut commented Jun 17, 2017

Fixed in 569c5bd, thanks!

Note that in https://godbolt.org/g/Ttiras the compiler emits the correct code for __builtin_clz both when using BSR and LZCNT, no manual adjustment is necessary. And it's not specific to LZCNT instruction, the same applies to SSE and similar. Whoever compiles the library should specify the correct target architecture.

As for this issue, it's definitely a bug in clang with Microsoft CodeGen because it violates the semantics of __builtin_clz.

@vitaut vitaut closed this Jun 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants