goalc: Some more macOS ARM64 work #3290

xTVaser · 2024-01-03T22:59:45Z

Not a whole lot in this PR, but I want to get this off my laptop where it's been sitting for a few months.

Started to split apart the implementation for different architectures and get CodeTester tests passing. At one point I proved to myself that I got it working (it being a single simple instruction) but then things fell apart when i began the grueling refactor.

…ut arm instructions

water111

Nice, I think this is a good start.

My main concern at this point is that ARM is actually quite a bit different from x86, so it will be hard to fit the ARM instructions into the existing pattern. I also think that using ifdefs to change between arm/x86 code generation will make it harder to maintain.

I'm looking through the compiler code now, and I'm wondering if we should just do the division at the codegen_object_file level - just totally new code for x86 and ARM at that point. We'll need a new linking format, different logic for picking instructions, different function prologues/epilogues, etc. With this approach, we'd probably want each IR to support an do_codegen_x86 and do_codegen_arm version, which would give a x86Instruction on x86 or ArmInstruction on ARM. I haven't thought this through fully yet.

water111 · 2024-01-21T22:17:52Z

goalc/compiler/CodeGenerator.cpp

-    if (saved_reg.is_xmm()) {
+    if (saved_reg.is_128bit_simd()) {


(unrelated to your changes here) I think that we will probably want an entirely separate do_goal_function for x86 vs. ARM. I think the way that the stack pointer is manipulated will be different, and this currently has some x86-specific things like sub_gpr64_imm8s which is using an x86-only smaller size instruction because an immediate is only 8-bits.

water111 · 2024-01-21T22:21:58Z

goalc/compiler/CodeGenerator.cpp

-        if (saved_reg.is_xmm()) {
+        if (saved_reg.is_128bit_simd()) {


It might be better to keep these as is_xmm. I think having both an is_xmm (true only for x86 XMM) and an is_128bit_simd (true for both arm/x86 vector regs) is good, rather than trying to have a abstraction over "general purpose" and "128bit" registers at this level.

ARM will probably have different code for loading/storing register to the stack. It has instructions for storing multiple registers at once, rather than a sequence of push instructions like we used on x86.

water111 · 2024-01-21T22:30:53Z

goalc/compiler/IR.cpp

@@ -862,7 +862,8 @@ RegAllocInstr IR_ConditionalBranch::to_rai() {
 void IR_ConditionalBranch::do_codegen(emitter::ObjectGenerator* gen,
                                      const AllocationResult& allocs,
                                      emitter::IR_Record irec) {
-  Instruction jump_instr(0);
+#ifndef __aarch64__


I'd rather keep the ability for the compiler to generate either ARM or x86 when built on either ARM or x86.

In C++, the code inside the #else here won't be checked by the compiler at all on x86, which will make it harder for developers on x86 to know if their changes broke the build on ARM.

water111 · 2024-01-21T22:31:13Z

goalc/compiler/compilation/Function.cpp

-  if (cc.return_reg && cc.return_reg->is_xmm()) {
+  if (cc.return_reg && cc.return_reg->is_128bit_simd()) {


👍 I think this is a good place to use is_128bit_simd over is_xmm

water111 · 2024-01-21T22:36:50Z

goalc/debugger/disassemble.h

-  InstructionInfo(const emitter::Instruction& _instruction, Kind _kind, int _ir_idx)
+  InstructionInfo(const emitter::Instruction _instruction, Kind _kind, int _ir_idx)


Based on what I've read above, I think that you have classes which inherit from Instruction, and you'll pass them here. This will cause the copy constructor of Instruction (the base class) to be called, which will cause slicing (see https://stackoverflow.com/questions/274626/what-is-object-slicing#274636). This ends up being undefined behavior and probably means that any virtual methods of Instruction that you call will crash or silently do the wrong thing.

The way around this is usually to pass a std::unique_ptr<Instruction> or std::shared_ptr<Instruction>... but it might be worth checking what we actually need an instruction for here. As far as I see, we just need the length, rather than the whole instruction.

(oops - I think I'm wrong since Instruction is just InstructionX86 or InstructionARM)
In that case, can we keep this as a const ref?

water111 · 2024-01-21T22:38:59Z

goalc/emitter/CodeTester.cpp

@@ -67,23 +67,37 @@ void CodeTester::emit_return() {
 * Pops RSP always, which is weird, but doesn't cause issues.
 */
 void CodeTester::emit_pop_all_gprs(bool exclude_rax) {
+#ifndef __aarch64__


I think it would be better not to have ifdef ARM and instead have some runtime variable. Otherwise it'll be really hard to make sure goalc actually builds on x86 and ARM.

water111 · 2024-01-21T22:39:59Z

goalc/emitter/CodeTester.cpp

+#if defined(__APPLE__) && defined(__aarch64__)
+  mprotect(code_buffer, code_buffer_capacity, PROT_EXEC | PROT_READ);
+  auto ret = ((u64(*)())code_buffer)();
+  mprotect(code_buffer, code_buffer_capacity, PROT_WRITE | PROT_READ);
+  return ret;
+#else


👍 this is a good place to use the ifdef - it's relatively simple, and the mprotect doesn't exist on windows.

water111 · 2024-01-21T22:41:51Z

goalc/emitter/IGen.h

+/*!
+ * Move data from src to dst. Moves all 64-bits of the GPR.
+ */
+extern Instruction mov_gpr64_gpr64(Register dst, Register src);


you don't need extern here.

water111 · 2024-01-21T22:44:22Z

goalc/emitter/IGenX86.cpp

+  ASSERT(src.is_gpr());
+  InstructionX86 instr(0x89);
+  instr.set_modrm_and_rex(src.hw_id(), dst.hw_id(), 3, true);
+  return instr;


Returning an InstructionX86 by value like this will also cause slicing

water111 · 2024-01-21T22:45:35Z

goalc/emitter/IGenX86.cpp

+
+Instruction movd_gpr32_xmm32(Register dst, Register src) {
+  ASSERT(dst.is_gpr());
+  ASSERT(src.is_128bit_simd());


This is a case when we should keep is_xmm I think. In this case, we're only moving the lower 32 bits between an XMM and GPR, so the xmm is being used as a 32-bit floating point register. I tried to keep the concept of "register type" and "register size" separate, since there's a lot of cases where you only use part of a register (x86 GPRs are 64-bits, but for some goal ops we treat them as 32-bits too).

xTVaser added 11 commits October 26, 2023 13:47

arm: fix arm64 compilation errors

c63e6bd

task: add some tasks to make building via command line faster

0b667d3

arm: start working through CodeTester file

5a0cd2d

arm-wip: start trying to abstract instruction emission code

dc863b2

arm-wip: figured out a minimally invasive instruction pattern, stub o…

e57d430

…ut arm instructions

goalc: start the process of splitting up architecture impl

5581536

Merge remote-tracking branch 'origin/master' into v/arm-work-2

7f3b950

goalc: remove some temp changes

19b5eea

ci: build MacOS arm64 on x86 runners

1e281d3

ci: disable macOS ARM tests for now

0ca6a5a

ci: no need to install rosetta (i think)

8bb8dfe

xTVaser force-pushed the v/arm-work-2 branch from f71b180 to a94e731 Compare January 4, 2024 02:03

goalc: resolve x86 compilation error

3767847

xTVaser force-pushed the v/arm-work-2 branch from a94e731 to 3767847 Compare January 4, 2024 02:08

cmake: coerce cmake to work

92ee892

xTVaser force-pushed the v/arm-work-2 branch from b27f8cf to 92ee892 Compare January 4, 2024 02:58

xTVaser changed the title ~~goalc: Some more MacOS ARM64 work~~ goalc: Some more macOS ARM64 work Jan 4, 2024

xTVaser marked this pull request as ready for review January 6, 2024 22:19

xTVaser requested a review from water111 January 9, 2024 21:51

water111 reviewed Jan 21, 2024

View reviewed changes

xTVaser marked this pull request as draft February 17, 2024 18:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

goalc: Some more macOS ARM64 work #3290

goalc: Some more macOS ARM64 work #3290

xTVaser commented Jan 3, 2024 •

edited

Loading

water111 left a comment

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

water111 Jan 21, 2024

		if (cc.return_reg && cc.return_reg->is_xmm()) {
		if (cc.return_reg && cc.return_reg->is_128bit_simd()) {

		InstructionInfo(const emitter::Instruction& _instruction, Kind _kind, int _ir_idx)
		InstructionInfo(const emitter::Instruction _instruction, Kind _kind, int _ir_idx)

goalc: Some more macOS ARM64 work #3290

Are you sure you want to change the base?

goalc: Some more macOS ARM64 work #3290

Conversation

xTVaser commented Jan 3, 2024 • edited Loading

water111 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xTVaser commented Jan 3, 2024 •

edited

Loading