-
Couldn't load subscription status.
- Fork 15k
[AArch64] Improve host feature detection. #160410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
SVE depends on a combination of host support and operating system support. Sometimes those don't line up with detected host CPU name; make sure SVE is disabled when it isn't available. Implement this for both Windows and Linux. (We don't have a codepath for other operating systems. If someone wants to implement this, it should be possible to adapt fmv code from compiler-rt.) While I'm here, also add support for detecting other Windows CPU features. For Windows, declare constants ourselves so the code builds on older SDKs; we also do this in compiler-rt.
13bdb5f to
fc5d883
Compare
|
On my Surface Pro 11 with a Qualcomm Snapdragon X I get: |
|
Is that the same as what you get without the patch? I messed up the feature string for jscvt; I'll push a fix. (On a side-note, I just downloaded the newest SDK, and there are a bunch of new feature flags defined; we probably want to look at that at some point... but probably not in this patch.) |
|
Before the patch: Looks like the patch adds the following target feature args: |
|
Does not appear to have fixed feature detection on Linux or Android. I'm not sure you're actually doing a feature check? I know cpuinfo does not have sve: |
|
If the code inside the if statement is reached, it should disable SVE, I think. There are a few different ways we might not get there, I guess:
|
|
@efriedma-quic Sorry for the delay. So, it did actually fix the ORC JIT example I gave when originally reporting the issue! However, clang itself is still improperly enabling it. As per above: llvm-project-21.1.3.src/llvm/lib/TargetParser/Host.cpp:1505 GDB CPU_INFO Content dumpAs far as I can tell, it never even calls getHostCPUFeatures. I set line and function based breakpoints, and getProcCpuinfoContent was the only one that ever triggered. I stepped through a fair amount of the code as well. GDB getProcCpuinfoContent Breakpoint and getHostCPUFeatures TracebackA quick look through it all, and I'd guess the relevant functions in src/clang/lib/Driver/ToolChains/Arch/AArch64.cpp do not do any checking and completely ignore most of the information they get back. addDefaultExts just looks at the CPU name, and adds the default extensions that the detected arch name claims to support; it doesn't seem to check/override with negative options when /proc/cpuinfo contradicts it, etc. llvm/lib/Target/AArch64/AArch64Features.td I did not do any further looking there, but I would not be surprised if other related issues could exist. |
|
So you're saying the SVE detection is working... but we're not calling this code from clang? Okay, that's something we can look into as a followup. There's an existing issue report for -march=native: #149370. There isn't any open issue for -mcpu=native, though. |
|
Correct, at least for whatever path ORC JIT takes, it properly does the runtime check, and it works. And correct, when setting -march=native or -mcpu=native, Clang's compilation pathway appears to use predefined feature sets in llvm/lib/Target/AArch64/AArch64Features.td, and only uses /proc/cpuinfo for the CPU's name, I guess. Both print out Or cause a SIGILL: clang -o example example.c -mcpu=native -O3 -ffp-model=fast#include <stdio.h>
int main(void) {
int n = 4096;
float a[n];
float b[n];
int indices[n];
for (int i = 0; i < n; i++) {
a[i] = (float)i;
b[i] = 0.0f;
indices[i] = (n - 1) - i;
}
for (int i = 0; i < n; i++) {
b[i] = a[indices[i]] * 2.0f;
}
printf("b[0] = %f, b[%d] = %f\n", b[0], n - 1, b[n - 1]);
return 0;
} |
|
getAArch64ArchFeaturesFromMcpu in clang/lib/Driver/ToolChains/Arch/AArch64.cpp looks like it should call getHostCPUFeatures, but maybe that's not happening for some reason, or the result is getting ignored somehow. In any case, please open a separate bug for that. |
|
Is it not covered by #130509 ? |
Can you be more specific here and say what "line up" means. I think what you mean is that I could have a core that contains the SVE extension but be running a Linux kernel that does not support SVE. Purely matching the CPU name would tell you that SVE is allowed. (a core without SVE with a kernel that supports it should be fine, the kernel will not attempt to use it afaik) |
Yes. Or it could be fused off, or disabled by firmware. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linux part LGTM.
(Like you said I think in the long run we might want to take what hwcaps says and trust it for all features we know about. I believe that is how GCC handles it, but there might be some subtlety in the details of which to use in what way).
Not sure what you're asking, perhaps you mean we should mark this PR as fixing that issue? @efriedma-quic can tell you whether it does / partially does / will need more testing. |
| Features["aes"] = (crypto & Aes) == Aes; | ||
| Features["sha2"] = (crypto & Sha2) == Sha2; | ||
|
|
||
| // SVE support is disabled in for cores which are identified as supporting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment wording still needs fixing. "in for" does not sound right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you were thinking "disabled for cores that have sve disabled in them" and combined the two thoughts here.
SVE depends on a combination of host support and operating system support. Sometimes those don't line up with detected host CPU name; make sure SVE is disabled when it isn't available. Implement this for both Windows and Linux. (We don't have a codepath for other operating systems. If someone wants to implement this, it should be possible to adapt fmv code from compiler-rt.)
While I'm here, also add support for detecting other Windows CPU features.
For Windows, declare constants ourselves so the code builds on older SDKs; we also do this in compiler-rt.