Add support for target details (CPUs and their supported features) #3927
Question: would this allow specifying a custom CPU data layout?
Why not more general types for CPUs and features? For example, rather than:
pub const HexagonCpu = enum {
Generic,
Hexagonv5,
Hexagonv55,
Hexagonv60,
Hexagonv62,
Hexagonv65,
Hexagonv66,
};
// ...
pub const MipsCpu = enum {
Mips1,
Mips2,
Mips3,
Mips32,
Mips32r2,
Mips32r3,
Mips32r5,
Mips32r6,
Mips4,
Mips5,
Mips64,
Mips64r2,
Mips64r3,
Mips64r5,
Mips64r6,
Octeon,
P5600,
};
// ...
pub const MipsFeature = enum {
Abs2008,
Crc,
Cnmips,
Dsp,
Dspr2,
Dspr3,
Eva,
Fp64,
// ...
...and then a bunch of reflection, instead something like:
pub const Cpu = struct {
name: []const u8,
features: []const *const Feature,
};
pub const Feature = struct {
name: []const u8,
description: []const u8,
llvm_name: []const u8,
};
pub const hexagon_v5 = Feature{
.name = "v5",
.description = "Enable Hexagon V5 architecture",
.llvm_name = "v5",
};
pub const mips_crc = Feature{
.name = "crc",
.description = "Mips R6 CRC ASE",
.llvm_name = "crc",
};
// The index is the architecture. Could be improved to be
// initialized with comptime block in order to use @enumToInt
// on an architecture enum value.
pub const features = [_][]const *const Feature{
&[_]*const Feature{
&hexagon_v5,
// ...more hexagon features...
},
&[_]*const Feature{
&mips_crc,
// ...more mips features...
},
// ...more architectures...
};
// The index is the architecture. Could be improved to be
// initialized with comptime block in order to use @enumToInt
// on an architecture enum value.
pub const cpus = [_][]const Cpu{
&[_]Cpu{
.{
.name = "generic",
.features = &[_]*const Feature{
&hexagon_v5,
// ...more Generic features...
},
},
// ...more hexagon cpus...
},
&[_]Cpu{
.{
.name = "mips1",
.features = &[_]*const Feature{
&mips_crc,
// ...more mips1 features...
},
},
// ...more mips cpus...
},
// ...more architectures...
};
What does having enum and struct types for everything accomplish? I'm guessing it has to do with your plans for how to expose the information in
Can you elaborate on the use case with an example?
Note that there are 2 use cases here. The first is compile time, where the code may do conditional compilation based on CPU features; here the comptime information would tell what CPU features were guaranteed at compile time to be available. The second is checking the CPU identification at runtime and then populating these structures so that runtime branching logic can decide what to do, or potentially something like function multi-versioning (#1018). It would be really nice to match the same data format for this feature as is used at compile time. That can be a separate issue, however.
It seemed like a good idea to place them into enums for Overall, your approach will work for the use cases I will cover in this PR, and it won't be too much of a hassle to switch to that representation. Additionally, it saves the weird reflection required to convert run-time-known enum values to their compile-time-known equivalents. It seems to make more sense for the most important use cases than what I have so far. In order to combat the EDIT: Also, multiple Zig |
I think we agree that we want the simpler way to represent the data. But simplicity is not one-dimensional, and it can be tricky to see the big picture. My mind went back and forth while typing this comment, I'm not entirely confident about the right way to go about this. I think what you have now is reasonable.
This seems perfectly fine to me; it's like rows in a database. This would allow a really simple linear list of CPUs and features. Another option would be to take my suggestion above, and have array indexes represent architectures, which may save some bytes in the generated binary which uses this info. I don't think I understand what you are suggesting though. I spent a lot of time looking at the diff just now, and my ultimate conclusion is that you're on the right track, and what you have now makes sense. |
Well, I've already mentioned my LLVM backend a number of times on IRC. It targets a 32-bit RISCish CPU, and currently pretends to be i686 in the eyes of the Zig compiler, as the data layout is nearly identical. However, I've also mentioned wanting to support, say, the Z80.
What I'm asking is this: would it be possible to specify e.g. |
If this means you're adding an LLVM target, then the CPU layout stuff would be on the LLVM side. This feature only (a) imports the CPUs and features defined by LLVM targets into zigland and (b) enables a zig user to specify the target CPU and enable/disable particular features. No CPU details beyond a name and a list of subfeatures are enumerated in this feature. Even if, say, pointer size were enumerated, this feature would not allow the user to change that on the fly since it only passes the chosen CPU name to LLVM (since that is the interface provided by the LLVM library). |
@layneson It's not on the LLVM side. I'm not using libLLVM. I have a parser written in Zig for LLVM IR, which is produced by a fully vanilla Zig compiler currently pretending to target i386 since that's the closest target.
Wait, really? I thought that custom CPU definitions were possible?
From the perspective of the Zig compiler (stage1), when generating IR only, is it possible to adjust the CPU expectations without tampering with libLLVM?
This is coming along nicely; I'm looking forward to merging this.
The run time stuff should match Linux's AT_HWCAP for non-x86 arches.
Is there any way I can help with this?
@andrewrk This is pretty much done as far as features are concerned. There is one part of this implementation that I do not like, however, so I would like your input on this: I switched to the representation you suggested here. It eliminated the need for generics and simplified most of the implementation. The only thing it made more difficult is exposing the target details in I have also taken an opinionated approach to the representation of these details (either a cpu is targeted, or a list of features are targeted, but never a combination of the two) in an attempt to simplify the feature/cpu relationship and distance Zig's implementation from LLVM in order to assist other backends in the future. Take a look at how that works ( |
Are you aware of @field? You can access declarations of structs by comptime string name.
👍 this sounds good |
I'm not quite sure how |
I'm having trouble understanding the example. (By the way, wouldn't If |
The problem I see with this is that we'd need a different struct for each arch (
Let me see if I can explain this better. I want the user to be able to see what details (features/cpu) were enabled when their program was compiled. Thus, I want to put this information in Since the features for each arch are enumerated as
In the absence of enums for all features/cpus, I opted for pointers to the "official" feature/cpu definitions. Then, in
When the compiler is run, all information about requested target details is provided in the form of strings. I can, say, parse a given cpu string by comparing it against the names of all defined Now that Assume that the user selected the
To do this, I need to generate the actual string |
If you have comptime-known
comptime {
    // Fail compilation when the required feature is not enabled.
    if (getEnabledFeature("jmpcall") == null) @compileError("this code requires jmpcall");
}
fn getEnabledFeature(name: []const u8) ?*std.builtin.CpuFeature {
    for (std.builtin.target.enabled_cpu_features) |*cpu_feature| {
        if (std.mem.eql(u8, cpu_feature.name, name)) return cpu_feature;
    }
    return null;
}
That's pseudo-code-ish but hopefully explains the idea. We have a Turing-complete language here that can run code at compile time. As long as the information is exposed at all, you can get access to it at compile time. I still don't quite understand, but I think it's on me. I'll re-read your longer response again later and see if it clicks. I do think that if you are resorting to appending a suffix or a prefix, then that is a smell. You should be able to take advantage of actual namespaces (structs).
This approach sounds perfectly reasonable to me.
@andrewrk There is one thing listed in #2883 that I am not sure how to implement: passing cpus/features to Other than that, this PR covers #2883. |
Yes here is the relevant code: Lines 9114 to 9136 in 7e5e767
This will affect both translate-c and building C code. You should be able to take advantage of Excellent work here. Let me know when the PR is ready for review / merge. |
These CPUs and features exist though, right? Can't we include them, and then set the llvm_name to null? When Zig has both LLVM and non-LLVM backends, we'll have CPUs/features that LLVM potentially doesn't know about |
I think it is ok, based on LLVM source. The entry point for cpu/feature strings is here. The comma-separated features string is separated here. Then each entry in the features string is visited by the loop here, and passed to |
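To make the comma-separated features string discussed above concrete, here is a minimal C++ sketch of splitting such a string into per-feature toggles. It is not LLVM's code: `FeatureToggle` and `parse_feature_string` are hypothetical names, and skipping empty entries (such as the one a trailing comma produces) is a choice made by this sketch, not a statement about what LLVM does. The only assumption carried over from LLVM is the `+name` / `-name` entry convention.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One entry of a feature string: the feature name and whether it is enabled.
// (Hypothetical type, for illustration only.)
struct FeatureToggle {
    std::string name;
    bool enabled;
};

// Split a comma-separated feature string such as "+sse,-avx" into entries.
// Empty entries (for example, from a trailing comma) are skipped by this sketch.
std::vector<FeatureToggle> parse_feature_string(const std::string &features) {
    std::vector<FeatureToggle> result;
    std::size_t start = 0;
    while (start <= features.size()) {
        std::size_t end = features.find(',', start);
        if (end == std::string::npos) end = features.size();
        std::string entry = features.substr(start, end - start);
        if (!entry.empty()) {
            // Each non-empty entry must carry an explicit '+' or '-' prefix.
            assert(entry[0] == '+' || entry[0] == '-');
            result.push_back({entry.substr(1), entry[0] == '+'});
        }
        start = end + 1;
    }
    return result;
}
```

With this sketch, `parse_feature_string("+sse,-avx,")` yields two entries, so a trailing comma is harmless here; whether the same holds for a given backend still has to be checked against that backend.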
You should be able to reproduce locally like this (from the build directory):
The CI server's QEMU version appears to be 2.11.
When I run your branch locally, with QEMU 4.1.0, I get this:
there's a swap file that made it into this PR: can you rebase that away? that could be problematic for vim users to be in the git history.
It looks like there are 1 or maybe 2 other places in codegen.cpp where the target cpu and features should be added to the cache. That could possibly be related to the test failures.
Edit: just create_c_object_cache needs the extra cache hash line. Probably unrelated to the test failure.
" --cpu [cpu] compile for [cpu] on the current target\n"
" --features [feature_str] compile with features in [feature_str] on the current target\n"
let's match clang's CLI here with -target-cpu and -target-feature
if (g->target_details) {
    cache_str(&cache_hash, stage2_target_details_get_cache_str(g->target_details));
}
When would this be null? It seems like we should always have a (possibly empty) set of target features that are enabled. Is this because we don't yet have the capability to determine the native CPU / feature set?
@@ -2214,6 +2214,8 @@ struct CodeGen {

     const char **clang_argv;
     size_t clang_argv_len;

     Stage2TargetDetails *target_details;
This should be moved to ZigTarget struct in target.hpp
After adding
--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -9562,6 +9562,9 @@ Error create_c_object_cache(CodeGen *g, CacheHash **out_cache_hash, bool verbose
     cache_int(cache_hash, g->zig_target->vendor);
     cache_int(cache_hash, g->zig_target->os);
     cache_int(cache_hash, g->zig_target->abi);
+    if (g->target_details) {
+        cache_str(cache_hash, stage2_target_details_get_cache_str(g->target_details));
+    }
with test.c:
int main(int argc, char **argv) {
    return 0;
}
I get:
This part looks wrong:
Edit: there's also a trailing comma there. Not sure if that is a problem.
// ABI warning
export fn stage2_target_details_get_cache_str(target_details: ?*const Stage2TargetDetails) [*:0]const u8 {
    if (target_details) |td| {
        return @as([*:0]const u8, switch (td.target_details) {
This @as is redundant; it's implied by the fact that the return type is [*:0]const u8
Alright, I fixed the CI failure. The problem was not forwarding target details to child CodeGen objects, e.g. when building compiler_rt and freestanding libc.
--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -10656,6 +10660,7 @@ CodeGen *create_child_codegen(CodeGen *parent_gen, Buf *root_src_path, OutType o
CodeGen *child_gen = codegen_create(nullptr, root_src_path, parent_gen->zig_target, out_type,
parent_gen->build_mode, parent_gen->zig_lib_dir, libc, get_global_cache_dir(), false, child_progress_node);
+ child_gen->target_details = parent_gen->target_details;
child_gen->root_out_name = buf_create_from_str(name);
child_gen->disable_gen_h = true;
child_gen->want_stack_check = WantStackCheckDisabled;
Also, here's a patch to remove the trailing comma, for one thing:
--- a/src-self-hosted/stage1.zig
+++ b/src-self-hosted/stage1.zig
@@ -692,6 +688,9 @@ const Stage2TargetDetails = struct {
try builtin_str_buffer.append(",");
}
}
+ if (mem.endsWith(u8, llvm_features_buffer.toSliceConst(), ",")) {
+ llvm_features_buffer.shrink(llvm_features_buffer.len() - 1);
+ }
try builtin_str_buffer.append("}};");
Based on #3927 (comment) sounds like this is ready for review
You did great work on this @layneson. I consider this to be ready for a hand-off, so whenever you feel done I will be happy to take it from there and land it in master branch.
Thanks! I will clean up a few things you've listed and let you know when that's done.
Previously, buffers were used with toOwnedSlice() to create c strings for LLVM cpu/feature strings. However, toOwnedSlice() shrinks the string memory to the buffer's length, which cuts off the null terminator. Now toSliceConst() is used instead, and the buffer is not deinited so that the string memory is not freed.
@andrewrk I rebased the swap file out and made the change to |
Alright, I'm on it. Thank you for this huge patch!
.dependencies = &[_]*const Feature {
    &feature_fuseLiterals,
    &feature_predictableSelectExpensive,
    &feature_customCheapAsMove,
according to
this should be FeatureExynosCheapAsMoveHandling.
how did these lists get generated? looks like the data needs to be audited.
Crap, looks like you found a bug in my generation scripts... sorry about that. I took a look and it appears that there is an issue with features with dependencies not being included in the CPU feature list (but their dependencies make it). I fixed it and manually verified that this CPU is correct. Would you like me to push the updated CPU lists?
Generating these is a pain. I run LLVM's tablegen on each arch's .td and parse the features/CPUs in there and resolve their feature bitmaps to get the final list.
is your project open source that generates the data from the .tds? it might be nice to run it after every llvm release and look at a diff.
I went with a different organization, if you look at this branch: https://github.com/ziglang/zig/tree/layneson-cpus_and_features
The aarch64.zig file is updated to the refactored organization. If you were willing to adjust your scripts and run it according to this new data layout, that would save me a lot of time
The main idea here was using a bit set to represent all the CPU features, rather than a slice of pointers to structs. This allows std.Target to maintain its current property that it does not need to be heap allocated no matter what the list of cpu features is.
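As a rough illustration of why a bit set avoids heap allocation, here is a minimal C++ sketch of a fixed-size feature set stored inline. It is not the layout used in the branch: the `Feature` indices, the `FeatureSet` type, and the 128-feature upper bound (`kMaxFeatures`) are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical feature indices for one architecture (illustrative only).
enum class Feature : std::size_t { aes = 0, sse = 1, avx = 2 };

// Assumed upper bound on the number of features per architecture.
constexpr std::size_t kMaxFeatures = 128;

// Fixed-size set of features stored inline as bits. Its size does not depend
// on how many features are enabled, so no heap allocation is ever needed.
struct FeatureSet {
    std::uint64_t words[kMaxFeatures / 64] = {};

    void add(Feature f) {
        std::size_t i = static_cast<std::size_t>(f);
        assert(i < kMaxFeatures);
        words[i / 64] |= std::uint64_t{1} << (i % 64);
    }

    void remove(Feature f) {
        std::size_t i = static_cast<std::size_t>(f);
        assert(i < kMaxFeatures);
        words[i / 64] &= ~(std::uint64_t{1} << (i % 64));
    }

    bool contains(Feature f) const {
        std::size_t i = static_cast<std::size_t>(f);
        assert(i < kMaxFeatures);
        return (words[i / 64] >> (i % 64)) & 1u;
    }
};
```

A slice of pointers, by contrast, needs backing storage whose length depends on the selected features, which is what would force an allocation for run-time-chosen feature lists.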
Didn't realize until a few minutes ago (when I saw the livestream VOD) that the data layout was changed. I will clean up my scripts, modify them to output the new data layout, and put that on GitHub. It would be nice to have a way of verifying the final output, but I'm not immediately sure of a way to do that without repeating any mistakes/wrong assumptions from the original generation code.
One question: how did you generate the tag names for the Feature enums? Did you go from the LLVM name and replace - with _? What about other characters (such as spaces and +)?
heads up, I'm about to rebase this branch against master
One question: how did you generate the tag names for the Feature enums? Did you go from the LLVM name and replace - with _? What about other characters (such as spaces and +)?
Yeah, it's the LLVM name, replacing - with _ (following the style guide). When this would lead to an identifier being illegal, put it in @"foo" (string literal identifier syntax).
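The naming rule described above can be sketched as a small C++ helper, written here in the spirit of a generation script. The function names are hypothetical, the sample inputs are only illustrative, and the sketch deliberately does not check reserved keywords or escape quotes inside the name; a real generator would have to handle both.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// True if `s` is a plain identifier of the form [A-Za-z_][A-Za-z0-9_]*.
// Reserved keywords are NOT checked in this sketch.
bool is_plain_identifier(const std::string &s) {
    if (s.empty()) return false;
    unsigned char first = static_cast<unsigned char>(s[0]);
    if (!(std::isalpha(first) || first == '_')) return false;
    for (char c : s) {
        unsigned char uc = static_cast<unsigned char>(c);
        if (!(std::isalnum(uc) || uc == '_')) return false;
    }
    return true;
}

// Derive an enum tag name from an LLVM feature name: replace '-' with '_',
// and fall back to the @"..." identifier syntax when the result would not be
// a legal plain identifier.
std::string tag_name_from_llvm_name(const std::string &llvm_name) {
    assert(!llvm_name.empty());
    std::string tag = llvm_name;
    for (char &c : tag) {
        if (c == '-') c = '_';
    }
    if (is_plain_identifier(tag)) return tag;
    return "@\"" + tag + "\"";
}
```

For instance, under these rules `fuse-literals` becomes `fuse_literals`, while a name starting with a digit or containing a `.` ends up wrapped in `@"..."`.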
@andrewrk I rewrote my generation scripts. The project can now be found here. The output looks correct (to a few minutes of hand-verification), but even diffing your .zig files against my generated ones doesn't work since the ordering of feature and CPU defs differs between the two, so I am not certain that it is correct. Let me know if you'd like any changes / something doesn't look right!
Thanks! I'll try this out now.
This is merged into #4264, which I'm actively working on merging into master. Would welcome your feedback on that.
Landed in 96e5f47. Thank you for spearheading this!
Sweet! Looking forward to using this.
This PR adds definitions for target CPUs and features which are supported by LLVM (and addresses issue #2883). There are four main use cases for such target details:
In build.zig, a user might specify a CPU or specific features to be enabled, either provided by a user or hard-coded.
Use case 1 requires handling a run-time-known arch. Use case 2 requires iterating over all supported arches. Use case 3 warrants hard-coded, enum-defined CPUs and features. Use case 4 is covered by the mechanisms of both 1 and 3; user-defined details can be handled as in 1 and hard-coded details can be handled as in 3. This PR attempts to allow all 4 use cases.
In this PR, a "CPU" represents a specific processor. Thus, if Zig code targets a CPU, it is designed to run on that specific CPU and thus support all features provided by the CPU.
A "feature" is a single, toggleable trait that may not be supported by all CPUs of a given architecture. Although some features depend on others (for instance, AES support on X86 implies SSE support), no feature exists for the sole purpose of enabling a list of subfeatures. This is in contrast to LLVM, where many "ghost" features exist. As an example, the avr arch supports the avr1 feature, which doesn't directly correspond to a hardware trait, but rather indicates a list of features supported by the avr1 family of microcontrollers. I do not plan on adding support for "feature families" in this PR, although something like that could be added later if need be.
The above representations of "CPU" and "feature" differ from those provided by LLVM, but this is good because (a) the above model is simpler and (b) the above model will enable easier integration of other backends into Zig.
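Since features may depend on other features (as in the AES/SSE example above), a consumer of this data typically wants the transitive closure of the enabled set. The following C++ sketch shows one way to compute it with a worklist; the function name and the index-based dependency table are assumptions for illustration and do not reflect how this PR stores or resolves dependencies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// deps[i] lists the indices of the features that feature i directly depends on.
// Returns the enabled flags with every transitive dependency switched on too.
std::vector<bool> expand_dependencies(
    const std::vector<std::vector<std::size_t>> &deps,
    std::vector<bool> enabled) {
    assert(enabled.size() == deps.size());

    // Start from every feature that is explicitly enabled.
    std::vector<std::size_t> worklist;
    for (std::size_t i = 0; i < enabled.size(); ++i) {
        if (enabled[i]) worklist.push_back(i);
    }

    // Enable each dependency once, then visit its own dependencies in turn.
    while (!worklist.empty()) {
        std::size_t f = worklist.back();
        worklist.pop_back();
        for (std::size_t d : deps[f]) {
            assert(d < deps.size());
            if (!enabled[d]) {
                enabled[d] = true;
                worklist.push_back(d);
            }
        }
    }
    return enabled;
}
```

Because each feature is enabled at most once, the loop terminates even if the dependency table happens to contain a cycle.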