Macos Intel crashes on calling linalg.mul #3762

Open
powerc9000 opened this issue Jun 15, 2024 · 1 comment

Comments

Contributor

powerc9000 commented Jun 15, 2024

Context

Odin: dev-2024-06:02f11dfde
OS: macOS Sonoma 14.4.1 (build: 23F79, kernel: 23.4.0)
CPU: Intel(R) Core(TM) i5-8500B CPU @ 3.00GHz
RAM: 16384 MiB
Backend: LLVM 18.1.6

Calling linalg.mul with a Matrix2 crashes with an EXC_I386_GPFLT (general protection fault).
This only happens with -o:none or -debug; -o:speed does not have the issue.

People on the Discord believed this is caused by bad code generation resulting in a bad stack pointer.

Expected Behavior

Don't crash.

Current Behavior

Crash

Failure Information (for bugs)

Disassembly

main`linalg.matrix_mul_vector-8549:
    0x100007260 <+0>:   movaps %xmm1, -0x58(%rsp)
    0x100007265 <+5>:   movaps %xmm0, -0x48(%rsp)
    0x10000726a <+10>:  movaps %xmm2, -0x38(%rsp)
    0x10000726f <+15>:  movaps -0x38(%rsp), %xmm0
    0x100007274 <+20>:  movaps -0x58(%rsp), %xmm1
    0x100007279 <+25>:  movaps -0x48(%rsp), %xmm2
    0x10000727e <+30>:  movlpd %xmm2, -0x10(%rsp)
    0x100007284 <+36>:  movlpd %xmm1, -0x8(%rsp)
    0x10000728a <+42>:  movlpd %xmm0, -0x18(%rsp)
    0x100007290 <+48>:  movq   $0x0, -0x20(%rsp)
->  0x100007299 <+57>:  movaps -0x10(%rsp), %xmm1
    0x10000729e <+62>:  movsd  -0x8(%rsp), %xmm0
    0x1000072a4 <+68>:  movss  -0x18(%rsp), %xmm3
    0x1000072aa <+74>:  movss  -0x14(%rsp), %xmm2
    0x1000072b0 <+80>:  movsldup %xmm3, %xmm3 ; xmm3 = xmm3[0,0,2,2] 
    0x1000072b4 <+84>:  movsldup %xmm2, %xmm2 ; xmm2 = xmm2[0,0,2,2] 
    0x1000072b8 <+88>:  mulps  %xmm3, %xmm1
    0x1000072bb <+91>:  mulps  %xmm2, %xmm0
    0x1000072be <+94>:  addps  %xmm1, %xmm0
    0x1000072c1 <+97>:  movq   $0x0, -0x28(%rsp)
    0x1000072ca <+106>: movlpd %xmm0, -0x28(%rsp)
    0x1000072d0 <+112>: movss  -0x28(%rsp), %xmm0
    0x1000072d6 <+118>: movss  -0x24(%rsp), %xmm1
    0x1000072dc <+124>: movss  %xmm1, -0x1c(%rsp)
    0x1000072e2 <+130>: movss  %xmm0, -0x20(%rsp)
    0x1000072e8 <+136>: movsd  -0x28(%rsp), %xmm0
    0x1000072ee <+142>: retq   

Registers

General Purpose Registers:
       rax = 0x41a0000041a00000
       rbx = 0x0000000100601b90
       rcx = 0x00000001000060a0  main`runtime.default_logger_proc at core.odin:653
       rdx = 0x00007ff80db1aaf0  libsystem_m.dylib`_FE_DFL_DISABLE_SSE_DENORMS_ENV + 7552
       rdi = 0x00007ff7bfefed70
       rsi = 0x00007ff7bfefecc8
       rbp = 0x00007ff7bfefece0
       rsp = 0x00007ff7bfefec78
        r8 = 0x0000000100007bdb  "/odin/base/runtime/entry_unix.odin"
        r9 = 0x0000000000000001
       r10 = 0x0000000000000000
       r11 = 0x0000000000000088
       r12 = 0x00007ff7bfefee20
       r13 = 0x0000000000000000
       r14 = 0x0000000100007080  main`main at entry_unix.odin:50
       r15 = 0x00007ff7bfefefa0
       rip = 0x0000000100007299  main`linalg.matrix_mul_vector-8549 + 57 at general.odin:217:2
    rflags = 0x0000000000010246
        cs = 0x000000000000002b
        fs = 0x0000000000000000
        gs = 0x0000000000000000
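
As a rough cross-check of the listing against the register dump, the sketch below recomputes the effective address of each movaps memory operand from the captured rsp. movaps requires a 16-byte-aligned memory operand and raises a general protection fault otherwise. The helper name is_aligned is hypothetical and is used here only for illustration; it is not part of core:math/linalg.

```odin
package main

import "core:fmt"

// Hypothetical helper: reports whether addr is a multiple of align.
// align is assumed to be a power of two.
is_aligned :: proc(addr, align: u64) -> bool {
    return (addr & (align - 1)) == 0
}

main :: proc() {
    // rsp as captured in the register dump above.
    rsp: u64 = 0x00007ff7bfefec78

    // Displacements below rsp used by the movaps instructions in the listing.
    // The last one (0x10) belongs to the faulting load at <+57>.
    offsets := [4]u64{0x58, 0x48, 0x38, 0x10}

    for off in offsets {
        addr := rsp - off
        fmt.printf("rsp-0x%x = 0x%x, mod 16 = %d, aligned = %v\n",
            off, addr, addr % 16, is_aligned(addr, 16))
    }
}
```

With this rsp, the first three operands land on multiples of 16, while rsp-0x10 is 0x00007ff7bfefec68, which leaves a remainder of 8. That is consistent with the fault being reported at the movaps on -0x10(%rsp).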

Steps to Reproduce

Sample program

package main

import "core:math/linalg"

main :: proc() {
    v1 : linalg.Vector2f32 = {1, 2}
    rot := linalg.matrix2_rotate(f32(20))

    res := linalg.mul(rot, v1)
}
Collaborator

laytan commented Jun 27, 2024

Looks like an alignment issue with the amd64 SysV ABI. You can see in the following snippet that the parameter is allocated with align 4 and then loaded as if it were align 16:

define internal void @main.foos(<{ <2 x float>, <2 x float> }> %0, ptr noalias nocapture nonnull %__.context_ptr) {
decls:
  %1 = alloca [4 x float], align 4
  %2 = alloca [4 x float], align 32
  %b = alloca [4 x float], align 32
  br label %entry

entry:                                            ; preds = %decls
  store <{ <2 x float>, <2 x float> }> %0, ptr %1, align 1
  %3 = load <4 x float>, ptr %1, align 16
  %4 = load <4 x float>, ptr %1, align 16
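
To spell out the mismatch in the snippet above: a slot allocated with align 4 is only guaranteed to sit on a multiple of 4, so a load that claims align 16 holds only by luck. The following minimal sketch states that rule as code. The procedure name load_is_guaranteed_aligned is hypothetical, and this is a simplified model of the alignment rule, not the compiler's actual check.

```odin
package main

import "core:fmt"

// Hypothetical helper: an aligned load of load_align bytes is only
// guaranteed valid when the slot's alignment is a multiple of load_align.
// Both values are assumed to be powers of two.
load_is_guaranteed_aligned :: proc(slot_align, load_align: u64) -> bool {
    return slot_align % load_align == 0
}

main :: proc() {
    // %1 is allocated with align 4 but loaded with align 16.
    fmt.println("slot align 4, load align 16:", load_is_guaranteed_aligned(4, 16))

    // A slot allocated like %2 or %b (align 32) would satisfy an align 16 load.
    fmt.println("slot align 32, load align 16:", load_is_guaranteed_aligned(32, 16))
}
```

Under this model only the align 32 slots can back an align 16 load, which matches the observation that %1 is the problematic allocation.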
