Add support for target details (CPUs and their supported features) #3927

Closed
wants to merge 26 commits

Conversation

layneson
Contributor

@layneson layneson commented Dec 16, 2019

This PR adds definitions for target CPUs and features which are supported by LLVM (and addresses issue #2883). There are four main use cases for such target details:

  1. During compilation, a user might specify a CPU or specific features to be enabled;
  2. On the command-line, a user might ask the compiler to list supported CPUs or features for a given target;
  3. At run-time, a user might query for the available features on the current platform;
  4. In build.zig, a user might specify a CPU or specific features to be enabled, as either provided by a user or hard-coded.

Use case 1 requires handling a run-time-known arch. Use case 2 requires iterating over all supported arches. Use case 3 warrants hard-coded, enum-defined CPUs and features. Use case 4 is covered by the mechanisms of both 1 and 3; user-defined details can be handled as in 1 and hard-coded details can be handled as in 3. This PR attempts to allow all 4 use cases.

In this PR, a "CPU" represents a specific processor. Thus, if Zig code targets a CPU, it is designed to run on that specific CPU and thus support all features provided by the CPU.

A "feature" is a single, toggleable trait that may not be supported by all CPUs of a given architecture. Although some features depend on others (for instance, AES support on X86 implies SSE support), no feature exists for the sole purpose of enabling a list of subfeatures. This is in contrast to LLVM, where many "ghost" features exist. As an example, the avr arch supports the avr1 feature, which doesn't directly correspond to a hardware trait, but rather indicates a list of features supported by the avr1 family of microcontrollers. I do not plan on adding support for "feature families" in this PR, although something like that could be added later if need be.

The above representations of "CPU" and "feature" differ from those provided by LLVM, but this is good because (a) the above model is simpler and (b) the above model will enable easier integration of other backends into Zig.

@pixelherodev
Contributor

Question: would this allow specifying a custom CPU data layout?

@andrewrk
Member

andrewrk commented Dec 18, 2019

Why not more general types for CPUs and features?

e.g. rather than:

pub const HexagonCpu = enum {
    Generic,
    Hexagonv5,
    Hexagonv55,
    Hexagonv60,
    Hexagonv62,
    Hexagonv65,
    Hexagonv66, 
};

// ...

pub const MipsCpu = enum {
    Mips1,
    Mips2,
    Mips3,
    Mips32,
    Mips32r2,
    Mips32r3,
    Mips32r5,
    Mips32r6,
    Mips4,
    Mips5,
    Mips64,
    Mips64r2,
    Mips64r3,
    Mips64r5,
    Mips64r6,
    Octeon,
    P5600, 
};

// ...


pub const MipsFeature = enum {
    Abs2008,
    Crc,
    Cnmips,
    Dsp,
    Dspr2,
    Dspr3,
    Eva,
    Fp64, 
// ...

...and then a bunch of reflection, instead something like:

pub const Cpu = struct {
    name: []const u8,
    features: []const *const Feature,
};
pub const Feature = struct {
    name: []const u8,
    description: []const u8,
    llvm_name: []const u8,
};
pub const hexagon_v5 = Feature{
    .name = "v5",
    .description = "Enable Hexagon V5 architecture",
    .llvm_name = "v5",
};
pub const mips_crc = Feature{
    .name = "crc",
    .description = "Mips R6 CRC ASE",
    .llvm_name = "crc",
};
// The index is the architecture. Could be improved to be
// initialized with comptime block in order to use @enumToInt
// on an architecture enum value.
pub const features = [_][]const *const Feature{
    &[_]*const Feature{
        &hexagon_v5,
        // ...more hexagon features...
    },
    &[_]*const Feature{
        &mips_crc,
        // ...more mips features...
    },
    // ...more architectures...
};
// The index is the architecture. Could be improved to be
// initialized with comptime block in order to use @enumToInt
// on an architecture enum value.
pub const cpus = [_][]const Cpu{
    &[_]Cpu{
        .{
            .name = "generic",
            .features = &[_]*const Feature{
                &hexagon_v5,
                // ...more Generic features...
            },
        },
        // ...more hexagon cpus...
    },
    &[_]Cpu{
        .{
            .name = "mips1",
            .features = &[_]*const Feature{
                &mips_crc,
                // ...more mips1 features...
            },
        },
        // ...more mips cpus...
    },
    // ...more architectures...
};

What does having enum and struct types for everything accomplish? I'm guessing it has to do with your plans for how to expose the information in builtin.zig?


Question: would this allow specifying a custom CPU data layout?

Can you elaborate on the use case with an example?

At run-time, a user might query for the available features on the current platform;

Note that there are two use cases here. One is compile time, where code may do conditional compilation based on CPU features; here the comptime information would tell which CPU features are guaranteed to be available at compile time.

Another use case is checking the CPU identification at runtime and then populating these structures so that runtime branching logic can decide what to do, or potentially something like function multi-versioning (#1018). It would be really nice to use the same data format for this as is used at compile time. That can be a separate issue, however.
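
Roughly, the runtime side could look something like this (a sketch with hypothetical names; real detection would come from something like cpuid or AT_HWCAP):

const std = @import("std");

const Feature = struct {
    name: []const u8,
};

// Populated at startup by whatever runtime detection mechanism ends up being used.
var detected_features: []const Feature = &[_]Feature{};

fn hasFeature(name: []const u8) bool {
    for (detected_features) |f| {
        if (std.mem.eql(u8, f.name, name)) return true;
    }
    return false;
}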

@layneson
Contributor Author

layneson commented Dec 18, 2019

What does having enum and struct types for everything accomplish? I'm guessing it has to do with your plans for how to expose the information in builtin.zig?

It seemed like a good idea to place them into enums for builtin.zig and for use in build.zig files. In the former case, a CPU and a list of supported features would be baked in and generic on the Target.Arch also defined in builtin.zig. As for build scripts, having enums seems preferable to constant values since the former (a) makes it very clear what CPUs/features are available for the chosen target arch and (b) enables a compile-time check that the requested CPUs/features are for the correct arch. Of course, with Zig's compile-time capabilities, this check is still possible with the scheme you presented above, but it is less clear to the user for the specific case in which they wish to hard-code CPUs/features. (Another way of wording it: instead of using enum values to represent unique CPUs/features, one must use pointers. The pointer type doesn't have any inherent Arch information, while my original scheme is generic on the Arch. Your approach seems less clean in this particular case, but will certainly work.)
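
As a rough sketch of that idea (hypothetical names, not the PR's actual definitions):

// Simplified stand-ins for the per-arch enums discussed above.
const Arch = enum { avr, mips };
const AvrFeature = enum { jmpcall, movw };
const MipsFeature = enum { crc, dsp };

// Selecting the feature enum from the arch at comptime means a feature from the
// wrong arch is rejected by the type system rather than checked at run time.
fn FeatureEnum(comptime arch: Arch) type {
    return switch (arch) {
        .avr => AvrFeature,
        .mips => MipsFeature,
    };
}

comptime {
    const f: FeatureEnum(.avr) = .jmpcall; // ok
    // const bad: FeatureEnum(.avr) = .crc; // would be a compile error
    _ = f;
}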

Overall, your approach will work for the use cases I will cover in this PR, and it won't be too much of a hassle to switch to that representation. Additionally, it saves the weird reflection required to convert run-time-known enum values to their compile-time-known equivalents. It seems to make more sense for the most important use cases than what I have so far.

In order to combat the build.zig issue presented above, I will try encoding the parent Arch within each Feature and CPU. It seems very redundant, so let me know if there is a better way to represent this "backwards" mapping (CPU/feature -> Arch).

EDIT: Also, multiple Zig Arch values occasionally map to a single set of CPUs and features. This slightly complicates the backwards mapping scheme.
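
Very roughly, the encoding would look something like this (a sketch; per the EDIT above, a real version would have to let a set of details belong to more than one Arch):

const Arch = enum { avr, x86 };

const Feature = struct {
    arch: Arch,
    name: []const u8,
};

const feature_jmpcall = Feature{ .arch = .avr, .name = "jmpcall" };

// The build.zig-side check that a requested feature matches the chosen arch.
fn checkFeature(target_arch: Arch, feature: *const Feature) void {
    if (feature.arch != target_arch) @panic("feature belongs to a different arch");
}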

@andrewrk
Member

I think we agree that we want the simpler way to represent the data. But simplicity is not one-dimensional, and it can be tricky to see the big picture. My mind went back and forth while typing this comment, I'm not entirely confident about the right way to go about this. I think what you have now is reasonable.

I will try encoding the parent Arch within each Feature and CPU. It seems very redundant, so let me know if there is a better way to represent this "backwards" mapping.

This seems perfectly fine to me; it's like rows in a database. This would allow a really simple linear list of CPUs and features. Another option would be to take my suggestion above, and have array indexes represent architectures, which may save some bytes in the generated binary which uses this info.

I don't think I understand what you are suggesting though. I spent a lot of time looking at the diff just now, and my ultimate conclusion is that you're on the right track, and what you have now makes sense.

@pixelherodev
Contributor

Can you elaborate on the use case with an example?

Well, I've already mentioned my LLVM backend a number of times on IRC. It targets a 32-bit RISCish CPU, and currently pretends to be i686 in the eyes of the Zig compiler, as the data layout is nearly identical.

However, I've also mentioned wanting to support, say, the Z80.

Add command-line support for specifying a CPU and optional features

What I'm asking is this: would it be possible to specify e.g. --cpu-pointer-size=16 --cpu-native-size=8 (though obviously not with that horrible syntax)?

@layneson
Contributor Author

layneson commented Dec 21, 2019

Well, I've already mentioned my LLVM backend a number of times on IRC

If this means you're adding an LLVM target, then the CPU layout stuff would be on the LLVM side. This feature only (a) imports the CPUs and features defined by LLVM targets into zigland and (b) enables a zig user to specify the target CPU and enable/disable particular features. No CPU details beyond a name and a list of subfeatures are enumerated in this feature. Even if, say, pointer size were enumerated, this feature would not allow the user to change that on the fly since it only passes the chosen CPU name to LLVM (since that is the interface provided by the LLVM library).

@pixelherodev
Contributor

@layneson It's not on the LLVM side. I'm not using libLLVM.

I have a parser written in Zig for LLVM IR, which is produced by a fully vanilla Zig compiler currently pretending to target i386 since that's the closest target.

it only passes the chosen CPU name to LLVM (since that is the interface provided by the LLVM library)

Wait, really? I thought that custom CPU definitions were possible?

@pixelherodev
Contributor

From the perspective of the Zig compiler (stage1), when generating IR only, is it possible to adjust the CPU expectations without tampering with libLLVM?

@andrewrk
Member

andrewrk commented Jan 7, 2020

This is coming along nicely; I'm looking forward to merging this.

@shawnl
Contributor

shawnl commented Jan 13, 2020

The run time stuff should match Linux's AT_HWCAP for non-x86 arches.

@andrewrk
Member

Is there any way I can help with this?

@layneson
Contributor Author

@andrewrk This is pretty much done as far as features are concerned. There is one part of this implementation that I do not like, however, so I would like your input on this:

I switched to the representation you suggested here. It eliminated the need for generics and simplified most of the implementation.

The only thing it made more difficult is exposing the target details in builtin.zig... since details (features/cpus) are not enumerated, there is no clean way to go from, say, *const Feature to a specific feature definition in the standard library (std.target.avr.feature_jmpcall, for example). What I did here was ensure that each Feature's name field matches its corresponding const variable name and then use the name field to derive the variable name to place in builtin.zig. Example: I tell the compiler to enable the jmpcall feature. That feature is defined at std.target.avr.feature_jmpcall. So the compiler adds the std.target.avr.feature_ prefix to jmpcall and puts that in the builtin.zig source.

Thus, a user could potentially pass a pointer to their own Feature or Cpu struct in build.zig, and this would cause a compilation error down the road since their string name might not match up with one of the detail definitions in the standard library. I have yet to think of a clever way to use reflection to go from a Feature/Cpu pointer to a definition in the standard library. This is the one place that having enums for the allowed features/cpus would have been helpful.

I have also taken an opinionated approach to the representation of these details (either a cpu is targeted, or a list of features are targeted, but never a combination of the two) in an attempt to simplify the feature/cpu relationship and distance Zig's implementation from LLVM in order to assist other backends in the future. Take a look at how that works (std.target.TargetDetails) and let me know if you think that approach is not desirable.

@andrewrk
Member

The only thing it made more difficult is exposing the target details in builtin.zig... since details (features/cpus) are not enumerated, there is no clean way to go from, say, *const Feature to a specific feature definition in the standard library (std.target.avr.feature_jmpcall, for example)

Are you aware of @field? You can access declarations of structs by comptime string name.

I have also taken an opinionated approach to the representation of these details (either a cpu is targeted, or a list of features are targeted, but never a combination of the two)

👍 this sounds good

@layneson
Contributor Author

layneson commented Jan 15, 2020

Are you aware of @field? You can access declarations of structs by comptime string name.

I'm not quite sure how @field will help me here. To use it, I would need the field name. However, my concern is with properly selecting that field name in the first place.

@andrewrk
Member

I'm having trouble understanding the example. (By the way, wouldn't feature or features be a struct, and then the features could be their own declarations? e.g. pub const features = struct { pub const jmpcall = true; } rather than feature_ prefix.)

If Feature has a pub const name = "foo"; then you can use some combination of @hasDecl or @field to get from a Feature to a related thing. I'm pretty sure the reflection capabilities are there. If I can understand better the problem I would be happy to suggest what it would look like.
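
For example, a nested features namespace can be queried by comptime string (a sketch, not the PR's actual layout):

// Hypothetical per-arch namespace; declarations are reachable by string name.
const avr = struct {
    pub const features = struct {
        pub const jmpcall = true;
        pub const movw = true;
    };
};

comptime {
    if (!@hasDecl(avr.features, "jmpcall")) @compileError("missing jmpcall");
    const enabled = @field(avr.features, "jmpcall");
    _ = enabled;
}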

@layneson
Contributor Author

layneson commented Jan 15, 2020

By the way, wouldn't feature or features be a struct, and then the features could be their own declarations?

The problem I see with this is that we'd need a different struct for each arch (AvrFeatures, X86Features, etc.), and then we are in the generics/reflection zone I purposely avoided.

If I can understand better the problem I would be happy to suggest what it would look like.

Let me see if I can explain this better.

I want the user to be able to see what details (features/cpu) were enabled when their program was compiled. Thus, I want to put this information in builtin.zig. This is parallel to including information about the target in builtin.zig. I created the TargetDetails type (defined here) to represent this (TargetDetails is also used in build.zig).

Since the features for each arch are enumerated as const feature_whatever = Feature { ... }, I made TargetDetails use Feature/Cpu pointers. Thus, specifying what Cpu to use in build.zig would look something like this:

exe.setTargetDetails(std.target.TargetDetails{
    .cpu = &std.target.avr.cpu_atmega328p,
});

In the absence of enums for all features/cpus, I opted for pointers to the "official" feature/cpu definitions. Then, in builtin.zig, something like this would be present:

const target_details: ?std.target.TargetDetails = .{
    .cpu = &std.target.avr.cpu_atmega328p,
};

When the compiler is run, all information about requested target details is provided in the form of strings. I can, say, parse a given cpu string by comparing it against the names of all defined Cpus for the given arch (this information is listed in arrays, see an example here). This will get me a *const Cpu. This can, in turn, be transformed into a TargetDetails.

Now that TargetDetails, which lives in the compiler, must be put in builtin.zig. Thus, it must be converted to a string representation during the builtin.zig code generation step. The issue here is now mapping the pointers provided in TargetDetails to the actual feature/cpu definitions (such as const cpu_atmega328p = ...) that live in the standard library.

Assume that the user selected the atmega328p cpu. I want the builtin.zig TargetDetails entry to look something like:

const target_details: ?std.target.TargetDetails = .{
    .cpu = &std.target.avr.cpu_atmega328p,
};

To do this, I need to generate the actual string std.target.avr.cpu_atmega328p. Getting this from a pointer such as const ptr = &std.target.avr.cpu_atmega328p at runtime (in the compiler) is the issue I am having with this current approach. My current approach is to use the cpu's name (ptr.name, which would be atmega328p in this example) and just throw the prefix std.target.avr.cpu_ on the front.
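
Concretely, that string building amounts to something like this sketch (not the compiler's exact code; the hard-coded avr prefix and the format specifier are just for illustration):

const std = @import("std");

// Derive the builtin.zig reference from the cpu's name field,
// e.g. "atmega328p" -> "std.target.avr.cpu_atmega328p".
fn renderCpuRef(buf: []u8, cpu_name: []const u8) ![]u8 {
    return std.fmt.bufPrint(buf, "std.target.avr.cpu_{s}", .{cpu_name});
}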

@andrewrk
Member

If you have comptime-known features of type []*const Feature, you can solve the problem of seeing what features are enabled at comptime:

comptime {
    if (getEnabledFeature("jmpcall") == null) @compileError("this code requires jmpcall");
}

fn getEnabledFeature(name: []const u8) ?*std.builtin.CpuFeature {
    for (std.builtin.target.enabled_cpu_features) |*cpu_feature| {
        if (std.mem.eql(u8, cpu_feature.name, name)) return cpu_feature;
    }
    return null;
}

That's pseudo-code-ish but hopefully explains the idea.

We have a turing-complete language here that can run code at compile-time. As long as the information is exposed at all, then you can get access to it at compile-time.

I still don't quite understand, but I think it's on me. I'll re-read your longer response again later and see if it clicks. I do think that if you are resorting to appending a suffix or a prefix, then that is a smell. You should be able to take advantage of actual namespaces (structs).

@andrewrk
Member

My current approach is to use the cpu's name (ptr.name, which would be atmega328p in this example)

This approach sounds perfectly reasonable to me.

@layneson
Contributor Author

@andrewrk There is one thing listed in #2883 that I am not sure how to implement: passing cpus/features to zig cc. I am assuming this is referring to where the c compiler is invoked during the compilation process (to build musl, for example). Is this assumption correct? Is there a single entry point in the compiler for that kind of cc invocation?

Other than that, this PR covers #2883.

@andrewrk
Member

Yes here is the relevant code:

zig/src/codegen.cpp

Lines 9114 to 9136 in 7e5e767

if (g->zig_target->is_native) {
    if (target_supports_clang_march_native(g->zig_target)) {
        args.append("-march=native");
    }
} else {
    args.append("-target");
    args.append(buf_ptr(&g->llvm_triple_str));
    if (target_is_musl(g->zig_target) && target_is_riscv(g->zig_target)) {
        // Musl depends on atomic instructions, which are disabled by default in Clang/LLVM's
        // cross compilation CPU info for RISCV.
        // TODO: https://github.com/ziglang/zig/issues/2883
        args.append("-Xclang");
        args.append("-target-feature");
        args.append("-Xclang");
        args.append(riscv_default_features);
    } else if (g->zig_target->os == OsFreestanding && g->zig_target->arch == ZigLLVM_x86) {
        args.append("-Xclang");
        args.append("-target-feature");
        args.append("-Xclang");
        args.append("-sse");
    }
}

This will affect both translate-c and building C code. You should be able to take advantage of llvm_name for this I believe. The idea is to communicate the same target information to clang as we did to llvm.

Excellent work here. Let me know when the PR is ready for review / merge.

@andrewrk
Member

andrewrk commented Jan 16, 2020

Remove features/cpus not in LLVM v9

These CPUs and features exist though, right? Can't we include them, and then set the llvm_name to null?

When Zig has both LLVM and non-LLVM backends, we'll have CPUs/features that LLVM potentially doesn't know about.

@layneson
Contributor Author

Maybe it's not correct to emit the "-" features and then re-enable them later?

I think it is ok, based on LLVM source. The entry point for cpu/feature strings is here. The comma-separated features string is separated here. Then each entry in the features string is visited by the loop here, and passed to ApplyFeatureFlag. Finally, a bit representing that feature is set/unset in a bitset here. Since the features are processed in the order they come, and since each one sets/clears a bit in the feature bitset, it seems to me that the "-" features would unset the proper bits, and then set those bits again if they are re-enabled.

@andrewrk
Member

You should be able to reproduce locally like this (from the build directory):

./zig test ../test/stage1/behavior.zig -target i386-linux-none --test-cmd qemu-i386 --test-cmd-bin

The CI server's QEMU version appears to be 2.11.

@andrewrk
Member

When I run your branch locally, with QEMU 4.1.0, I get this:

[nix-shell:~/dev/zig/build]$ ./zig test ../test/stage1/behavior.zig -target i386-linux-none --test-cmd qemu-i386 --test-cmd-bin
Test [386/856] behavior.floatop.test "@floor"...test failure
/home/andy/dev/zig/lib/std/testing.zig:177:14: 0x451071 in std.testing.expect (test)
    if (!ok) @panic("test failure");
             ^
/home/andy/dev/zig/test/stage1/behavior/floatop.zig:334:15: 0x46e09e in behavior.floatop.testFloor (test)
        expect(@floor(a) == 3);
              ^
/home/andy/dev/zig/test/stage1/behavior/floatop.zig:318:14: 0x45840a in behavior.floatop.test "@floor" (test)
    testFloor();
             ^
/home/andy/dev/zig/lib/std/special/test_runner.zig:22:25: 0x4bd529 in std.special.main (test)
        if (test_fn.func()) |_| {
                        ^
/home/andy/dev/zig/lib/std/start.zig:259:37: 0x477055 in std.start.posixCallMainAndExit (test)
            const result = root.main() catch |err| {
                                    ^
/home/andy/dev/zig/lib/std/start.zig:123:5: 0x476f0b in std.start._start (test)
    @call(.{ .modifier = .never_inline }, posixCallMainAndExit, .{});
    ^
qemu: uncaught target signal 6 (Aborted) - core dumped

Tests failed. Use the following command to reproduce the failure:
/home/andy/dev/zig/build/zig-cache/o/l1lkNG4MwaVtIqlr1avuoRBQwCpFEEW4DBO5ctOaTdMGPKyZIhsq5RWidbN_muGs/test

@andrewrk
Member

andrewrk commented Jan 16, 2020

there's a swap file that made it into this PR: lib/std/.builtin.zig.swp

can you rebase that away? having that in the git history could be problematic for vim users.

how to set up global .gitignore

@andrewrk andrewrk left a comment
Member

It looks like there are 1 or maybe 2 other places in codegen.cpp where the target cpu and features should be added to the cache. That could possibly be related to the test failures.

edit: just create_c_object_cache needs the extra cache hash line. Probably unrelated to the test failure.

Comment on lines +103 to +104
" --cpu [cpu] compile for [cpu] on the current target\n"
" --features [feature_str] compile with features in [feature_str] on the current target\n"
Member

let's match clang's CLI here with -target-cpu and -target-feature

Comment on lines +8657 to +8659
if (g->target_details) {
    cache_str(&cache_hash, stage2_target_details_get_cache_str(g->target_details));
}
Member

When would this be null? It seems like we should always have a (possibly empty) set of target features that are enabled. Is this because we don't yet have the capability to determine the native CPU / feature set?

@@ -2214,6 +2214,8 @@ struct CodeGen {

    const char **clang_argv;
    size_t clang_argv_len;

    Stage2TargetDetails *target_details;
Member

This should be moved to ZigTarget struct in target.hpp

@andrewrk
Copy link
Member

andrewrk commented Jan 17, 2020

After adding

--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -9562,6 +9562,9 @@ Error create_c_object_cache(CodeGen *g, CacheHash **out_cache_hash, bool verbose
     cache_int(cache_hash, g->zig_target->vendor);
     cache_int(cache_hash, g->zig_target->os);
     cache_int(cache_hash, g->zig_target->abi);
+    if (g->target_details) {
+        cache_str(cache_hash, stage2_target_details_get_cache_str(g->target_details));
+    }

with test.c:

int main(int argc, char **argv) {
    return 0;
} 

I get:

[nix-shell:~/dev/zig/build]$ ./zig build-exe --c-source test.c -target i386-linux-gnu --verbose-cc
/home/andy/dev/zig/build/zig cc -MD -MV -MF zig-cache/tmp/0l9Pf4ONINI--test.o.d -nostdinc -fno-spell-checking -isystem /home/andy/dev/zig/lib/include -target i386-unknown-linux-gnu -Xclang -target-cpu -Xclang  -Xclang -target-feature -Xclang -3dnow,-3dnowa,-64bit,-adx,-aes,-avx,-avx2,-avx512f,-avx512bf16,-avx512bitalg,-bmi,-bmi2,-avx512bw,-branchfusion,-avx512cd,-cldemote,-clflushopt,-clwb,-clzero,-cmov,-cx8,-cx16,-avx512dq,-enqcmd,-avx512er,-ermsb,-f16c,-fma,-fma4,-fsgsbase,-fxsr,-fast-11bytenop,-fast-15bytenop,-fast-bextr,-fast-hops,-fast-lzcnt,-fast-partial-ymm-or-zmm-write,-fast-shld-rotate,-fast-scalar-fsqrt,-fast-scalar-shift-masks,-fast-variable-shuffle,-fast-vector-fsqrt,-fast-vector-shift-masks,-gfni,-fast-gather,-avx512ifma,-invpcid,-sahf,-lea-sp,-lea-uses-ag,-lwp,-lzcnt,-false-deps-lzcnt-tzcnt,-mmx,-movbe,-movdir64b,-movdiri,-mpx,-mwaitx,-macrofusion,-merge-to-threeway-branch,-nopl,-pclmul,-pconfig,-avx512pf,-pku,-popcnt,-false-deps-popcnt,-prefetchwt1,-prfchw,-ptwrite,-pad-short-functions,-prefer-256-bit,-rdpid,-rdrnd,-rdseed,-rtm,-retpoline,-retpoline-external-thunk,-retpoline-indirect-branches,-retpoline-indirect-calls,-sgx,-sha,-shstk,-sse,-sse2,-sse3,-sse4a,-sse4.1,-sse4.2,-sse-unaligned-mem,-ssse3,-slow-3ops-lea,-idivl-to-divb,-idivq-to-divl,-slow-incdec,-slow-lea,-slow-pmaddwd,-slow-pmulld,-slow-shld,-slow-two-mem-ops,-slow-unaligned-mem-16,-slow-unaligned-mem-32,-soft-float,-tbm,-vaes,-avx512vbmi,-avx512vbmi2,-avx512vl,-avx512vnni,-avx512vp2intersect,-vpclmulqdq,-avx512vpopcntdq,-waitpkg,-wbnoinvd,-x87,-xop,-xsave,-xsavec,-xsaveopt,-xsaves,-16bit-mode,-32bit-mode,-64bit-mode,+cmov,+cx8,+fxsr,+mmx,+nopl,+sse,+sse2,+slow-unaligned-mem-16,+x87, -g -fno-omit-frame-pointer -fsanitize=undefined -fsanitize-trap=undefined -D_DEBUG -fno-stack-protector -o zig-cache/tmp/0l9Pf4ONINI--test.o -c test.c

This part looks wrong: -Xclang -target-cpu -Xclang -Xclang -target-feature. Maybe it's fine since we pass a null terminated empty string. But it still looks weird; I would like to omit the parameter if CPU name is not available.

Edit: there's also a trailing comma there. Not sure if that is a problem.

// ABI warning
export fn stage2_target_details_get_cache_str(target_details: ?*const Stage2TargetDetails) [*:0]const u8 {
    if (target_details) |td| {
        return @as([*:0]const u8, switch (td.target_details) {
Member

This @as is redundant; it's implied by the return type being [*:0]const u8.

@andrewrk
Member

andrewrk commented Jan 17, 2020

Alright, I fixed the CI failure. The problem was that target details were not being forwarded to child CodeGen objects, e.g. when building compiler_rt and freestanding libc.

--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -10656,6 +10660,7 @@ CodeGen *create_child_codegen(CodeGen *parent_gen, Buf *root_src_path, OutType o
     CodeGen *child_gen = codegen_create(nullptr, root_src_path, parent_gen->zig_target, out_type,
         parent_gen->build_mode, parent_gen->zig_lib_dir, libc, get_global_cache_dir(), false, child_progress_node);
+    child_gen->target_details = parent_gen->target_details;
     child_gen->root_out_name = buf_create_from_str(name);
     child_gen->disable_gen_h = true;
     child_gen->want_stack_check = WantStackCheckDisabled;

Also here's a patch to remove the trailing comma for one thing:

--- a/src-self-hosted/stage1.zig
+++ b/src-self-hosted/stage1.zig
@@ -692,6 +688,9 @@ const Stage2TargetDetails = struct {
                 try builtin_str_buffer.append(",");
             }
         }
+        if (mem.endsWith(u8, llvm_features_buffer.toSliceConst(), ",")) {
+            llvm_features_buffer.shrink(llvm_features_buffer.len() - 1);
+        }
 
         try builtin_str_buffer.append("}};");
 

@andrewrk
Member

Based on #3927 (comment) sounds like this is ready for review

@andrewrk andrewrk marked this pull request as ready for review January 17, 2020 00:31
@andrewrk
Member

You did great work on this @layneson. I consider this to be ready for a hand-off, so whenever you feel done I will be happy to take it from there and land it in master branch.

@layneson
Contributor Author

Thanks! I will clean up a few things you've listed and let you know when that's done.

Previously, buffers were used with toOwnedSlice() to create c strings
for LLVM cpu/feature strings. However, toOwnedSlice() shrinks the
string memory to the buffer's length, which cuts off the null terminator.
Now toSliceConst() is used instead, and the buffer is not deinited
so that the string memory is not freed.
@layneson
Contributor Author

layneson commented Jan 17, 2020

@andrewrk I rebased the swap file out and made the change to fix the @floor test case. I think it's best for you to take it from here.

@andrewrk
Member

Alright I'm on it. Thank you for this huge patch!

.dependencies = &[_]*const Feature {
    &feature_fuseLiterals,
    &feature_predictableSelectExpensive,
    &feature_customCheapAsMove,
Member

according to

https://github.com/llvm/llvm-project/blob/release/9.x/llvm/lib/Target/AArch64/AArch64.td#L582

this should be FeatureExynosCheapAsMoveHandling.

how did these lists get generated? looks like the data needs to be audited.

Contributor Author

Crap, looks like you found a bug in my generation scripts... sorry about that. I took a look, and it appears there is an issue where features that have dependencies are not included in the CPU feature list (though their dependencies make it in). I fixed it and manually verified that this CPU is correct. Would you like me to push the updated CPU lists?

Contributor Author

Generating these is a pain. I run LLVM's tablegen on each arch's .td and parse the features/CPUs in there and resolve their feature bitmaps to get the final list.

Member

is the project that generates the data from the .tds open source? it might be nice to run it after every llvm release and look at a diff.

I went with a different organization; take a look at this branch: https://github.com/ziglang/zig/tree/layneson-cpus_and_features

The aarch64.zig file is updated to the refactored organization. If you were willing to adjust your scripts and run them according to this new data layout, that would save me a lot of time.

Member

The main idea here was using a bit set to represent all the CPU features, rather than a slice of pointers to structs. This allows std.Target to maintain its current property that it does not need to be heap allocated no matter what the list of cpu features is.
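
A minimal sketch of that idea (hypothetical feature names, using @enumToInt as Zig spells it at the time of this PR; not the actual std.Target layout):

const Feature = enum(u6) { sse, sse2, avx, avx2 };

const FeatureSet = struct {
    bits: u64 = 0,

    fn add(self: *FeatureSet, f: Feature) void {
        self.bits |= @as(u64, 1) << @enumToInt(f);
    }

    fn has(self: FeatureSet, f: Feature) bool {
        return ((self.bits >> @enumToInt(f)) & 1) == 1;
    }
};

Membership checks are then mask tests, and the whole set fits in a fixed-size integer, so std.Target never needs heap allocation.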

Contributor Author

Didn't realize until a few minutes ago (when I saw the livestream VOD) that the data layout was changed. I will clean up my scripts, modify them to output the new data layout, and put that on GitHub. It would be nice to have a way of verifying the final output, but I'm not immediately sure of a way to do that without repeating any mistakes/wrong assumptions from the original generation code.

One question: how did you generate the tag names for the Feature enums? Did you go from the LLVM name and replace - with _? What about other characters (such as spaces and +)?

Member

heads up, I'm about to rebase this branch against master

Member

One question: how did you generate the tag names for the Feature enums? Did you go from the LLVM name and replace - with _? What about other characters (such as spaces and +)?

Yeah it's the LLVM name, replacing - with _ (following the style guide). When this would lead to an identifier being illegal, put it in @"foo" (string literal identifier syntax).
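
For instance (a hypothetical grouping, just to show the naming rule):

const ExampleFeature = enum {
    fuse_aes, // from LLVM's "fuse-aes": '-' replaced with '_'
    @"sse4.1", // not a legal plain identifier, so it uses @"..." syntax
};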

Contributor Author

@andrewrk I rewrote my generation scripts. The project can now be found here. The output looks correct (to a few minutes of hand-verification), but even diffing your .zig files against my generated ones doesn't work since the ordering of feature and CPU defs differs between the two, so I am not certain that it is correct. Let me know if you'd like any changes / something doesn't look right!

Member

Thanks! I'll try this out now.

@andrewrk
Member

This is merged into #4264, which I'm actively working on merging into master. Would welcome your feedback on that.

@andrewrk andrewrk closed this Jan 22, 2020
@andrewrk
Member

Landed in 96e5f47. Thank you for spearheading this!

@layneson
Contributor Author

Sweet! Looking forward to using this.
