Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reinterpret arrays segfaults on CI #52118

Closed
vtjnash opened this issue Nov 10, 2023 · 2 comments · Fixed by #52182
Closed

reinterpret arrays segfaults on CI #52118

vtjnash opened this issue Nov 10, 2023 · 2 comments · Fixed by #52182
Labels
ci Continuous integration

Comments

@vtjnash
Copy link
Member

vtjnash commented Nov 10, 2023

Test MWE:

using Test, InteractiveUtils
struct TSlow{T,N} <: AbstractArray{T,N}
    data::Dict{NTuple{N,Int}, T}
    dims::NTuple{N,Int}
end
TSlow(::Type{T}, dims::NTuple{N,Int}) where {T,N} = TSlow{T,N}(Dict{NTuple{N,Int}, T}(), dims)
TSlow(X::AbstractArray{T,N}) where {T,N} = TSlow{T,N}(X)
TSlow{T,N}(X::AbstractArray     ) where {T,N  } = begin
    A = TSlow(T, size(X))
    for I in CartesianIndices(X)
        A[Tuple(I)...] = X[Tuple(I)...]
    end
    A
end

Base.size(A::TSlow) = A.dims
Base.getindex(A::TSlow{T,N}, i::Vararg{Int,N}) where {T,N} = get(A.data, i, zero(T))
Base.setindex!(A::TSlow{T,N}, v, i::Vararg{Int,N}) where {T,N} = (A.data[i] = v)

B = Complex{Int64}[5+6im, 7+8im, 9+10im]
function f(Bc)
    reinterpret(NTuple{3, Int64}, Bc)[2] = (4,5,6)
    @test Bc == Complex{Int64}[5+6im, 7+4im, 5+6im]
    B2 = reinterpret(NTuple{3, Int64}, Bc)
    @test setindex!(B2, (1,2,3), 1) === B2
end
versioninfo()
f(TSlow(B))

Result:

$ usr/bin/julia.exe -e "include(\"C:\\\\Users\\\\MYJ\\\\.envpack\\\\temp.jl\")"
Julia Version 1.11.0-DEV.862
Commit 8f8b9cacdf (2023-11-08 15:20 UTC)
Platform Info:
  OS: Windows (i686-w64-mingw32)
  CPU: 16 × 12th Gen Intel(R) Core(TM) i7-12650H
  WORD_SIZE: 32
  LLVM: libLLVM-15.0.7 (ORCJIT, alderlake)
  Threads: 8 on 12 virtual cores
Environment:
  JULIA_CPU_THREADS = 12
  JULIA_NUM_THREADS = 6

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x27a9c4f3 -- memcpy at .\cmem.jl:14 [inlined]
_setindex_ra! at .\reinterpretarray.jl:584 [inlined]
setindex! at .\reinterpretarray.jl:505 [inlined]
macro expansion at D:\Documents\GitHub\julia\julia-mingw-w32\usr\share\julia\stdlib\v1.11\Test\src\Test.jl:676 [inlined]
f at C:\Users\MYJ\.envpack\temp.jl:27
in expression starting at C:\Users\MYJ\.envpack\temp.jl:30
memcpy at .\cmem.jl:14 [inlined]
_setindex_ra! at .\reinterpretarray.jl:584 [inlined]
setindex! at .\reinterpretarray.jl:505 [inlined]
macro expansion at D:\Documents\GitHub\julia\julia-mingw-w32\usr\share\julia\stdlib\v1.11\Test\src\Test.jl:676 [inlined]
f at C:\Users\MYJ\.envpack\temp.jl:27
unknown function (ip: 27a9cb31)
_jl_invoke at /cygdrive/d/Documents/GitHub/julia/src\gf.c:2913 [inlined]
ijl_apply_generic at /cygdrive/d/Documents/GitHub/julia/src\gf.c:3095
jl_apply at /cygdrive/d/Documents/GitHub/julia/src\julia.h:2130 [inlined]
do_call at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:126
eval_value at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:223
eval_stmt_value at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:174 [inlined]
eval_body at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:641
jl_interpret_toplevel_thunk at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:795
jl_toplevel_eval_flex at /cygdrive/d/Documents/GitHub/julia/src\toplevel.c:938
jl_toplevel_eval_flex at /cygdrive/d/Documents/GitHub/julia/src\toplevel.c:881
ijl_toplevel_eval at /cygdrive/d/Documents/GitHub/julia/src\toplevel.c:947 [inlined]
ijl_toplevel_eval_in at /cygdrive/d/Documents/GitHub/julia/src\toplevel.c:989
eval at .\boot.jl:418 [inlined]
include_string at .\loading.jl:2133
jl_fptr_args at /cygdrive/d/Documents/GitHub/julia/src\gf.c:2555
_jl_invoke at /cygdrive/d/Documents/GitHub/julia/src\gf.c:2894 [inlined]
ijl_apply_generic at /cygdrive/d/Documents/GitHub/julia/src\gf.c:3095
_include at .\loading.jl:2193
include at .\sysimg.jl:38
unknown function (ip: 27a90071)
_jl_invoke at /cygdrive/d/Documents/GitHub/julia/src\gf.c:2913 [inlined]
ijl_apply_generic at /cygdrive/d/Documents/GitHub/julia/src\gf.c:3095
jl_apply at /cygdrive/d/Documents/GitHub/julia/src\julia.h:2130 [inlined]
do_call at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:126
eval_value at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:223
eval_stmt_value at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:174 [inlined]
eval_body at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:641
jl_interpret_toplevel_thunk at /cygdrive/d/Documents/GitHub/julia/src\interpreter.c:795
jl_toplevel_eval_flex at /cygdrive/d/Documents/GitHub/julia/src\toplevel.c:938
jl_toplevel_eval_flex at /cygdrive/d/Documents/GitHub/julia/src\toplevel.c:881
ijl_toplevel_eval at /cygdrive/d/Documents/GitHub/julia/src\toplevel.c:947 [inlined]
ijl_toplevel_eval_in at /cygdrive/d/Documents/GitHub/julia/src\toplevel.c:989
eval at .\boot.jl:418 [inlined]
exec_options at .\client.jl:291
_start at .\client.jl:525
jfptr__start_69683 at D:\Documents\GitHub\julia\julia-mingw-w32\usr\lib\julia\sys.dll (unknown line)
_jl_invoke at /cygdrive/d/Documents/GitHub/julia/src\gf.c:2894 [inlined]
ijl_apply_generic at /cygdrive/d/Documents/GitHub/julia/src\gf.c:3095
jl_apply at /cygdrive/d/Documents/GitHub/julia/src\julia.h:2130 [inlined]
true_main at /cygdrive/d/Documents/GitHub/julia/src\jlapi.c:586
jl_repl_entrypoint at /cygdrive/d/Documents/GitHub/julia/src\jlapi.c:738
jl_load_repl at /cygdrive/d/Documents/GitHub/julia/cli\loader_lib.c:569
mainCRTStartup at /cygdrive/d/Documents/GitHub/julia/cli\loader_exe.c:58
BaseThreadInitThunk at C:\Windows\System32\KERNEL32.DLL (unknown line)
RtlInitializeExceptionChain at C:\Windows\SYSTEM32\ntdll.dll (unknown line)
RtlClearBits at C:\Windows\SYSTEM32\ntdll.dll (unknown line)
Allocations: 1454771 (Pool: 1454720; Big: 51); GC: 4

Originally posted by @N5N3 in #52101 (comment)

@N5N3
Copy link
Member

N5N3 commented Nov 10, 2023

bisect points to #50650

943db02999a3acfb14d55ccb8e90f3404b2bd1b9 is the first bad commit
commit 943db02999a3acfb14d55ccb8e90f3404b2bd1b9
Author: pchintalapudi <[email protected]>
Date:   Thu Jul 27 15:17:04 2023 +0000

    Restructure JIT to have more extension points and more logical control flow (#50650)

    Rather than forking the optimize and compile layers into individual
    layers per optlevel, each optimize and compile layer will switch on the
    requested optlevel of the module, which reduces the complexity of
    tracing a module's path through the JIT. In addition, this lets us move
    some of the `addModule` code to happen post-optimization, which makes
    the optimization pipeline not see literal pointers except those
    generated by codegen.

 src/jitlayers.cpp | 344 +++++++++++++++++++++++++++++++++---------------------
 src/jitlayers.h   |  41 ++-----
 2 files changed, 220 insertions(+), 165 deletions(-)

@vtjnash vtjnash added the ci Continuous integration label Nov 14, 2023
@vtjnash
Copy link
Member Author

vtjnash commented Nov 14, 2023

We segfault on this instruction:

; ││││ @ reinterpretarray.jl:610 within `_setindex_ra!`                                                                                                                                                                                   
; ││││┌ @ cmem.jl:14 within `memcpy`                                                                                 
        vmovaps xmm0, xmmword ptr [".L_j_const#4"]                                                                                                                                                                                        
; ││││└                                                                                                              

Which is declared as:

        .type   ".L_j_const#4",@object          # @"_j_const#4"                                                                                                                                                                           
        .p2align        4                                                                                            
".L_j_const#4":                                                                                                      
        .quad   1                               # 0x1                                                                
        .quad   2                               # 0x2                                                                
        .quad   3                               # 0x3                                                                
        .size   ".L_j_const#4", 24                                                                                   

But at runtime [".L_j_const#4"] has a value of 0x584bd778

Which fails to meet the p2align constraint. This memory (with the wrong alignment) comes from JuliaOJIT::JITPointersT::getSharedBytes, which was introduced in that commit (as code movement from JuliaOJIT::shareStrings).

The original commit is therefore likely 0791aef (#44573), but as with most things, it also used a StringMap before that commit, so I am not sure how this ever worked.

vtjnash added a commit that referenced this issue Nov 15, 2023
…gnment

The llvm optimizations may increase alignment beyond the initial
MAX_ALIGN. This pool's alignment was previously only
`sizeof(struct { atomic<int> RefCount; size_t Length; char Data[]; })`
however, potentially resulting in segfaults at runtime.

Fixes #52118
vtjnash added a commit that referenced this issue Nov 16, 2023
…gnment

The llvm optimizations may increase alignment beyond the initial
MAX_ALIGN. This pool's alignment was previously only
`sizeof(struct { atomic<int> RefCount; size_t Length; char Data[]; })`
however, potentially resulting in segfaults at runtime.

Fixes #52118
vtjnash added a commit that referenced this issue Nov 16, 2023
…gnment

The llvm optimizations may increase alignment beyond the initial
MAX_ALIGN. This pool's alignment was previously only
`sizeof(struct { atomic<int> RefCount; size_t Length; char Data[]; })`
however, potentially resulting in segfaults at runtime.

Fixes #52118
staticfloat pushed a commit that referenced this issue Nov 16, 2023
…gnment

The llvm optimizations may increase alignment beyond the initial
MAX_ALIGN. This pool's alignment was previously only
`sizeof(struct { atomic<int> RefCount; size_t Length; char Data[]; })`
however, potentially resulting in segfaults at runtime.

Fixes #52118

(cherry picked from commit 8cb8a12)
vtjnash added a commit that referenced this issue Nov 16, 2023
…gnment

The llvm optimizations may increase alignment beyond the initial
MAX_ALIGN. This pool's alignment was previously only
`sizeof(struct { atomic<int> RefCount; size_t Length; char Data[]; })`
however, potentially resulting in segfaults at runtime.

Fixes #52118
DilumAluthge pushed a commit that referenced this issue Nov 17, 2023
…gnment (#52182)

The llvm optimizations may increase alignment beyond the initial
MAX_ALIGN. This pool's alignment was previously only `sizeof(struct {
atomic<int> RefCount; size_t Length; char Data[]; })` however,
potentially resulting in segfaults at runtime.

Fixes #52118. Should make CI much happier.
vtjnash added a commit that referenced this issue Nov 28, 2023
…gnment (#52182)

The llvm optimizations may increase alignment beyond the initial
MAX_ALIGN. This pool's alignment was previously only `sizeof(struct {
atomic<int> RefCount; size_t Length; char Data[]; })` however,
potentially resulting in segfaults at runtime.

Fixes #52118. Should make CI much happier.

(cherry picked from commit a65bc9a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Continuous integration
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants