Invalid read on 32-bits introduced by tupocolypse #11003

Closed
garrison opened this issue Apr 25, 2015 · 32 comments
Labels
bug Indicates an unexpected problem or unintended behavior regression Regression in behavior compared to a previous version system:32-bit Affects only 32-bit systems

Comments

@garrison
Member

Since the merging of the tuple type redesign (#10380), valgrind (with -DMEMDEBUG) frequently reports invalid reads on 32-bits.

Here's a minimal test case: complex([1 2; 3 4], [5 6; 7 8]).

$ valgrind --smc-check=all-non-file ../julia -e "complex([1 2; 3 4], [5 6; 7 8])"
==24698== Memcheck, a memory error detector
==24698== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==24698== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==24698== Command: ../julia -e complex([1\ 2;\ 3\ 4],\ [5\ 6;\ 7\ 8])
==24698== 
==24698== Invalid read of size 4
==24698==    at 0x6098806: ???
==24698==    by 0x60986F4: ???
==24698==    by 0x40AAF13: jl_apply (julia.h:1276)
==24698==    by 0x40AED9E: jl_trampoline (builtins.c:1026)
==24698==    by 0x409FAC1: jl_apply (julia.h:1276)
==24698==    by 0x40A5433: jl_apply_generic (gf.c:1720)
==24698==    by 0x4135EC3: jl_apply (julia.h:1276)
==24698==    by 0x4136372: do_call (interpreter.c:63)
==24698==    by 0x4136F6A: eval (interpreter.c:210)
==24698==    by 0x413602C: jl_interpret_toplevel_expr (interpreter.c:25)
==24698==    by 0x415044D: jl_toplevel_eval_flex (toplevel.c:505)
==24698==    by 0x4150546: jl_toplevel_eval (toplevel.c:528)
==24698==  Address 0xfeb24580 is on thread 1's stack
==24698==  16 bytes below stack pointer
@JeffBezanson JeffBezanson added bug Indicates an unexpected problem or unintended behavior regression Regression in behavior compared to a previous version system:32-bit Affects only 32-bit systems labels Apr 25, 2015
@garrison
Member Author

garrison commented May 1, 2015

I'm not sure if this will provide any clue, but I get the invalid read the first two times this is called, but never on any calls after the second time.

$ valgrind -q --smc-check=all-non-file ./julia -q
julia> complex([1 2; 3 4], [5 6; 7 8])
==29011== Invalid read of size 4
==29011==    at 0x60A5B16: ???
==29011==    by 0x60A5A04: ???
==29011==    by 0x40AAF13: jl_apply (julia.h:1276)
==29011==    by 0x40AED9E: jl_trampoline (builtins.c:1026)
==29011==    by 0x409FAC1: jl_apply (julia.h:1276)
==29011==    by 0x40A5433: jl_apply_generic (gf.c:1720)
==29011==    by 0x4135DCF: jl_apply (julia.h:1276)
==29011==    by 0x413627E: do_call (interpreter.c:63)
==29011==    by 0x4136E76: eval (interpreter.c:210)
==29011==    by 0x4135F38: jl_interpret_toplevel_expr (interpreter.c:25)
==29011==    by 0x4150359: jl_toplevel_eval_flex (toplevel.c:505)
==29011==    by 0x4150452: jl_toplevel_eval (toplevel.c:528)
==29011==  Address 0xfec50660 is on thread 1's stack
==29011==  16 bytes below stack pointer
==29011== 
2x2 Array{Complex{Int32},2}:
 1+5im  2+6im
 3+7im  4+8im

julia> complex([1 2; 3 4], [5 6; 7 8])
==29011== Invalid read of size 4
==29011==    at 0x60A5B16: ???
==29011==    by 0x60A5A04: ???
==29011==    by 0x409FAC1: jl_apply (julia.h:1276)
==29011==    by 0x40A5333: jl_apply_generic (gf.c:1696)
==29011==    by 0x4135DCF: jl_apply (julia.h:1276)
==29011==    by 0x413627E: do_call (interpreter.c:63)
==29011==    by 0x4136E76: eval (interpreter.c:210)
==29011==    by 0x4135F38: jl_interpret_toplevel_expr (interpreter.c:25)
==29011==    by 0x4150359: jl_toplevel_eval_flex (toplevel.c:505)
==29011==    by 0x4150452: jl_toplevel_eval (toplevel.c:528)
==29011==    by 0x40AD146: jl_toplevel_eval_in (builtins.c:536)
==29011==    by 0x40AD32E: jl_f_top_eval (builtins.c:565)
==29011==  Address 0xfec50760 is on thread 1's stack
==29011==  16 bytes below stack pointer
==29011== 
2x2 Array{Complex{Int32},2}:
 1+5im  2+6im
 3+7im  4+8im

julia> complex([1 2; 3 4], [5 6; 7 8])
2x2 Array{Complex{Int32},2}:
 1+5im  2+6im
 3+7im  4+8im

julia> complex([1 2; 3 4], [5 6; 7 8])
2x2 Array{Complex{Int32},2}:
 1+5im  2+6im
 3+7im  4+8im

This seems to be a minimal test case, too -- 1D arrays don't do it, and passing an integer as one argument to complex and a matrix as the other argument doesn't cause the warning either.

@garrison
Member Author

garrison commented May 1, 2015

Comparing two matrices also gives the invalid read. And again, the warning happens precisely twice if the comparison is called repeatedly.

julia> a = [1 2; 3 4]
2x2 Array{Int32,2}:
 1  2
 3  4

julia> a == a
==17142== Invalid read of size 4
==17142==    at 0x60AF156: ???
==17142==    by 0x60AF04E: ???
==17142==    by 0x60AEFD9: ???
==17142==    by 0x40AAF13: jl_apply (julia.h:1276)
==17142==    by 0x40AED9E: jl_trampoline (builtins.c:1026)
==17142==    by 0x409FAC1: jl_apply (julia.h:1276)
==17142==    by 0x40A5433: jl_apply_generic (gf.c:1720)
==17142==    by 0x4135DCF: jl_apply (julia.h:1276)
==17142==    by 0x413627E: do_call (interpreter.c:63)
==17142==    by 0x4136E76: eval (interpreter.c:210)
==17142==    by 0x4135F38: jl_interpret_toplevel_expr (interpreter.c:25)
==17142==    by 0x4150359: jl_toplevel_eval_flex (toplevel.c:505)
==17142==  Address 0xfeae46f0 is on thread 1's stack
==17142==  16 bytes below stack pointer
==17142== 
true

I am able to get this warning up to two times, and then if I run the complex minimal test case (given above) in the same REPL session, I can get up to two additional warnings from it.

@yuyichao
Contributor

yuyichao commented May 9, 2015

Got a 32-bit container and got vgdb working.... here's the backtrace.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0801017b in julia_==_43936 (t1=<optimized out>, t2=<optimized out>)
    at tuple.jl:80
80      tuple.jl: No such file or directory.
(gdb) bt
#0  0x0801017b in julia_==_43936 (t1=<optimized out>, t2=<optimized out>)
    at tuple.jl:80
#1  0x08010075 in julia_complex_43935 (A=0xfeffd23c, B=0x2) at array.jl:871
#2  0x040ca018 in jl_apply (f=0x9f16ac0, args=0xfeffd23c, nargs=2)
    at julia.h:1280
#3  0x040cd9e5 in jl_trampoline (F=0x9f16ac0, args=0xfeffd23c, nargs=2)
    at builtins.c:961
#4  0x040bee5d in jl_apply (f=0x9f16ac0, args=0xfeffd23c, nargs=2)
    at julia.h:1280
#5  0x040c47eb in jl_apply_generic (F=0x9b441d0, args=0xfeffd23c, nargs=2)
    at gf.c:1767
#6  0x04162d66 in jl_apply (f=0x9b441d0, args=0xfeffd23c, nargs=2)
    at julia.h:1280
#7  0x041631de in do_call (f=0x9b441d0, args=0x9eea434, nargs=2, eval0=0x0, 
    locals=0x0, nl=0, ngensym=0) at interpreter.c:65
#8  0x04163d81 in eval (e=0x9ee7ec0, locals=0x0, nl=0, ngensym=0)
    at interpreter.c:212
#9  0x04162ebf in jl_interpret_toplevel_expr (e=0x9ee7ec0) at interpreter.c:27
#10 0x0417cf28 in jl_toplevel_eval_flex (e=0x9ee7e50, fast=1) at toplevel.c:507
#11 0x0417d010 in jl_toplevel_eval (v=0x9ee7e50) at toplevel.c:530
#12 0x040cc127 in jl_toplevel_eval_in (m=0x8884010, ex=0x9ee7e50)
    at builtins.c:538
#13 0x040cc2f5 in jl_f_top_eval (F=0x0, args=0xfeffd81c, nargs=2)
    at builtins.c:567
#14 0x05fa815e in julia_process_options_41937 (opts=0xfeffd988, args=0x2)
    at client.jl:288
#15 0x05fa7088 in julia__start_41936 () at client.jl:409
#16 0x05fa7a96 in jlcall.start_41936 ()
   from /build/julia-git/src/julia/usr/lib/julia/sys.so
#17 0x040ca018 in jl_apply (f=0x9b45ac0, args=0x0, nargs=0) at julia.h:1280
#18 0x040cd9e5 in jl_trampoline (F=0x9b45ac0, args=0x0, nargs=0)
    at builtins.c:961
#19 0x040bee5d in jl_apply (f=0x9b45ac0, args=0x0, nargs=0) at julia.h:1280
#20 0x040c4711 in jl_apply_generic (F=0x9b45a60, args=0x0, nargs=0) at gf.c:1743
#21 0x08048f42 in jl_apply (f=0x9b45a60, args=0x0, nargs=0)
    at ../src/julia.h:1280
#22 0x08049e2b in true_main (argc=0, argv=0xfeffdd10) at repl.c:441
#23 0x0804a02a in main (argc=0, argv=0xfeffdd10) at repl.c:494

@yuyichao
Contributor

yuyichao commented May 9, 2015

@Keno This is what I get following your advice.

declare i1 @"julia_==_43936"([2 x i32], [2 x i32])

@yuyichao
Contributor

yuyichao commented May 9, 2015

And the command on the gdb side

(gdb) p jl_function_ptr_by_llvm_name("julia_==_43936")
$2 = (void *) 0x28ea62a0
(gdb) p jl_dump_llvm_value(jl_function_ptr_by_llvm_name("julia_==_43936"))
$3 = void
(gdb) p jl_function_ptr_by_llvm_name("julia_complex_43935")
$4 = (void *) 0x28b19990
(gdb) p jl_dump_llvm_value(jl_function_ptr_by_llvm_name("julia_complex_43935"))
$5 = void

@yuyichao
Contributor

yuyichao commented May 9, 2015

Also, jl_(jl_dump_function_asm(jl_function_ptr_by_llvm_name("julia_complex_43935"))) prints the assembly, but p jl_(jl_dump_function_ir(jl_function_ptr_by_llvm_name("julia_complex_43935"), 0)) still only prints the function name...

@Keno
Member

Keno commented May 9, 2015

I assume that function is JITted? If so, you may have to enable KEEP_BODIES in options.h to see the assembly.

@yuyichao
Contributor

yuyichao commented May 9, 2015

Actually, running it from gdb directly works (it might have something to do with where it is called)...

llvmir

@yuyichao
Contributor

yuyichao commented May 9, 2015

Result from jl_dump_llvm_value
Not sure if it makes a difference...

@yuyichao
Contributor

yuyichao commented May 9, 2015

And yes it is JITed (otherwise trying to print a NULL pointer will segfault...)

AFAICT this function shouldn't have any allocation and/or gc (didn't see any call to *alloc* functions and it doesn't do anything to jl_pgcstack either) so this should not be GC related?

@Keno
Member

Keno commented May 10, 2015

I wouldn't be surprised if something got miscompiled. Can you post the disassembly as well? I'll have a look tomorrow evening.

@yuyichao
Contributor

I've just closed the container. Will post that later.

@yuyichao
Contributor

@Keno

This should be everything you want (you probably don't need all of it, but it would take much longer to get it again...)
Assembly from gdb
LLVM IR from jl_dump_llvm_value
Assembly from jl_dump_function_asm
LLVM IR from jl_dump_function_ir(..., 0)
LLVM IR from jl_dump_function_ir(..., 1)
Backtrace of the invalid read
Registers dump

All of these were obtained in the same session, so the addresses in the backtrace and register dump point directly into the disassembly from gdb.

@yuyichao
Contributor

valgrind also reports a number of "Conditional jump or move depends on uninitialized value(s)" in llvm or llvm related paths
e.g.
(When I try to print the asm using llvm)

==181== Conditional jump or move depends on uninitialised value(s)
==181==    at 0x415E9E6: jl_dump_asm_internal (disasm.cpp:470)                    
==181==    by 0x40E9964: jl_dump_function_asm (codegen.cpp:1055)                  
==181==    by 0xFE97102E: ???                                                     
==181==    by 0x74F8F8F: ???                                                      
==181==    by 0x8A6B1007: ???                                                     
==181==                                                                           
==181== (action on error) vgdb me ...                                             
==181== Continuing ...                                                            
==181== Conditional jump or move depends on uninitialised value(s)                
==181==    at 0x415EA2C: jl_dump_asm_internal (disasm.cpp:470)                    
==181==    by 0x40E9964: jl_dump_function_asm (codegen.cpp:1055)                  
==181==    by 0xFE97102E: ???                                                     
==181==    by 0x74F8F8F: ???                                                      
==181==    by 0x8A6B1007: ???                                                     
==181==                                                                           
==181== (action on error) vgdb me ...                                             

and (the first warning from valgrind)

==181== Conditional jump or move depends on uninitialised value(s)
==181==    at 0x45605E8: llvm::DwarfCompileUnit::addRange(llvm::RangeSpan) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x45335C7: llvm::DwarfDebug::endFunction(llvm::MachineFunction const*) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x451C36B: llvm::AsmPrinter::EmitFunctionBody() (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x4285A9E: llvm::X86AsmPrinter::runOnMachineFunction(llvm::MachineFunction&) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x46100A7: llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x4C0D43E: llvm::FPPassManager::runOnFunction(llvm::Function&) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x4C0D694: llvm::FPPassManager::runOnModule(llvm::Module&) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x4C0D9B8: llvm::legacy::PassManagerImpl::run(llvm::Module&) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x4C0DB7E: llvm::legacy::PassManager::run(llvm::Module&) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x458606D: llvm::MCJIT::emitObject(llvm::Module*) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x45865EA: llvm::MCJIT::generateCodeForModule(llvm::Module*) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181==    by 0x458437B: llvm::MCJIT::getSymbolAddress(std::string const&, bool) (in /build/julia-git/src/julia/usr/lib/libjulia-debug.so)
==181== 
==181== (action on error) vgdb me ... 

Not sure if it matters.
IIRC it also happens on my 64-bit system, although valgrind doesn't report any invalid read there.

llvm is stock 3.6.0-5 from ArchLinux repo (both 32 and 64 bit).

On 32bit, julia is compiled with MARCH=i686

@garrison
Member Author

@yuyichao that's bug #10806 (at least the latter part of it is).

@Keno
Member

Keno commented May 11, 2015

Hmm, the problem is that one of the SSA variables gets spilled to a stack slot, and then the stack restore moves the stack pointer above it, causing valgrind to think we're accessing invalid memory. I'm not sure if it's legal to carry SSA values across stackrestore, but if it is, I don't think there's anything we can do.

@Keno
Member

Keno commented May 11, 2015

(Or more precisely whether it's legal to use SSA values declared after stacksave).
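
A toy model of the situation described in the two comments above may help readers follow it. This is only a sketch in Python, not Julia's or LLVM's actual code: the `ToyStack` class, its method names, and the sizes 12 and 4 are all assumptions made for illustration. It models a downward-growing stack and flags any read from an address below the stack pointer, which is roughly how a memcheck-style tool treats that region.

```python
class ToyStack:
    """Downward-growing stack; addresses below sp count as unallocated."""

    def __init__(self, top=0x1000):
        self.sp = top        # stack pointer; the stack grows toward lower addresses
        self.memory = {}     # address -> value, only for slots that were written

    def alloca(self, size):
        """Reserve `size` bytes by lowering sp; return the new slot's address."""
        self.sp -= size
        return self.sp

    def stacksave(self):
        """Remember the current stack pointer."""
        return self.sp

    def stackrestore(self, saved_sp):
        """Move the stack pointer back to a previously saved position."""
        self.sp = saved_sp

    def store(self, addr, value):
        self.memory[addr] = value

    def load(self, addr):
        """Return (value, report); report is set when addr lies below sp."""
        report = None
        if addr < self.sp:
            report = f"invalid read: {self.sp - addr} bytes below stack pointer"
        return self.memory.get(addr), report


def spill_across_restore():
    stack = ToyStack()
    saved = stack.stacksave()
    stack.alloca(12)             # allocation made after the save (size is arbitrary)
    slot = stack.alloca(4)       # spill slot for a value defined after the save
    stack.store(slot, 42)
    stack.stackrestore(saved)    # sp moves back above the spill slot
    return stack.load(slot)      # the value is still there, but it sits below sp


print(spill_across_restore())
```

In this model the read still returns the stored value, which is consistent with the correct results printed in the REPL sessions above; only the position of the slot relative to the stack pointer triggers the report.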

@garrison
Member Author

Does this show up only on 32 bits simply because there are fewer registers there to work with?

@Keno
Member

Keno commented May 11, 2015

Yes. I'm still debating whether there is actually something that needs to be fixed. Will do some experiments.

@yuyichao
Contributor

@Keno Correct me if I'm wrong. Does this mean that if a signal handler kicks in at the right time, it can corrupt the stack variable?
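
The thread does not answer this question directly, so the following is only a toy sketch of why the concern is plausible, not a reproduction of the bug. It assumes a handler that runs on the same stack and builds its frame immediately below the current stack pointer; the function name `run`, the 64-byte frame size, and the fill value are all made up for illustration.

```python
def run(deliver_signal):
    """Return the value read back from a spill slot left below sp."""
    memory = {}
    sp = 0x1000                  # toy stack pointer; the stack grows downward

    saved_sp = sp                # stacksave
    sp -= 16                     # space allocated after the save
    spill_slot = sp
    memory[spill_slot] = 42      # value spilled into that space
    sp = saved_sp                # stackrestore: the slot is now below sp

    if deliver_signal:
        # A handler running on the same stack writes its frame just below sp.
        frame_size = 64          # arbitrary size chosen for the illustration
        for addr in range(sp - frame_size, sp, 4):
            memory[addr] = 0xDEADBEEF
        # The handler returns here; sp is unchanged afterwards.

    return memory[spill_slot]


print(run(False), hex(run(True)))
```

Without the simulated signal the spilled value survives; with it, the slot is overwritten because nothing reserves memory below the stack pointer in this model.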

@vtjnash vtjnash mentioned this issue May 21, 2015
19 tasks
@yuyichao
Contributor

@Keno Any update?

@Keno
Member

Keno commented May 22, 2015

I inquired with the LLVM folks and this may actually be a legitimate bug. I haven't gotten around to investigating further though.

vtjnash added a commit that referenced this issue Jun 9, 2015
fix #11187, fix #11450, fix #11026, ref #10525, fix #11003
TODO: confirm all of those numbers were fixed
TODO: ensure the lazy-loaded objects have gc-roots
TODO: re-enable VectorType objects, so small objects still end up in
registers in the calling convention
TODO: allow moving pointers sometimes rather than copying
TODO: teach the GC how it can re-use an existing pointer as a box

this also changes the julia specSig calling convention to pass
non-primitive types by pointer instead of by-value

this additionally fixes a bug in gen_cfunction that could be exposed by
turning off specSig

this additionally moves the alloca calls in ccall (and other places) to
the entry BasicBlock in the function, ensuring that llvm detects them as
static allocations and moves them into the function prologue
vtjnash added a commit that referenced this issue Jun 9, 2015
fix #11187, fix #11450, fix #11026, ref #10525, fix #11003
TODO: confirm all of those numbers were fixed
TODO: ensure the lazy-loaded objects have gc-roots
TODO: re-enable VectorType objects, so small objects still end up in
registers in the calling convention
TODO: allow moving pointers sometimes rather than copying
TODO: teach the GC how it can re-use an existing pointer as a box

this also changes the julia specSig calling convention to pass
non-primitive types by pointer instead of by-value

this additionally fixes a bug in gen_cfunction that could be exposed by
turning off specSig

this additionally moves the alloca calls in ccall (and other places) to
the entry BasicBlock in the function, ensuring that llvm detects them as
static allocations and moves them into the function prologue

this additionally fixes some undefined behavior from changing
a variable's size through a alloca-cast instead of zext/sext/trunc
vtjnash added a commit that referenced this issue Jun 11, 2015
fix #11187 (pass struct and tuple objects by stack pointer)
fix #11450 (ccall emission was frobbing the stack)
likely may fix #11026 and may fix #11003 (ref #10525) invalid stack-read on 32-bit

this additionally changes the julia specSig calling convention to pass
non-primitive types by pointer instead of by-value

this additionally fixes a bug in gen_cfunction that could be exposed by
turning off specSig

this additionally moves the alloca calls in ccall (and other places) to
the entry BasicBlock in the function, ensuring that llvm detects them as
static allocations and moves them into the function prologue

this additionally fixes some undefined behavior from changing
a variable's size through a alloca-cast instead of zext/sext/trunc

this additionally prepares for turning back on allocating tuples as vectors,
since the gc now guarantees 16-byte alignment

future work this makes possible:
 - create a function to replace the jlallocobj_func+init_bits_value call pair (to reduce codegen pressure)
 - allow moving pointers sometimes rather than always copying immutable data
 - teach the GC how it can re-use an existing pointer as a box
vtjnash added a commit that referenced this issue Jun 12, 2015
fix #11187 (pass struct and tuple objects by stack pointer)
fix #11450 (ccall emission was frobbing the stack)
likely may fix #11026 and may fix #11003 (ref #10525) invalid stack-read on 32-bit

this additionally changes the julia specSig calling convention to pass
non-primitive types by pointer instead of by-value

this additionally fixes a bug in gen_cfunction that could be exposed by
turning off specSig

this additionally moves the alloca calls in ccall (and other places) to
the entry BasicBlock in the function, ensuring that llvm detects them as
static allocations and moves them into the function prologue

this additionally fixes some undefined behavior from changing
a variable's size through a alloca-cast instead of zext/sext/trunc

this additionally prepares for turning back on allocating tuples as vectors,
since the gc now guarantees 16-byte alignment

future work this makes possible:
 - create a function to replace the jlallocobj_func+init_bits_value call pair (to reduce codegen pressure)
 - allow moving pointers sometimes rather than always copying immutable data
 - teach the GC how it can re-use an existing pointer as a box
vtjnash added a commit that referenced this issue Jun 16, 2015
fix #11187 (pass struct and tuple objects by stack pointer)
fix #11450 (ccall emission was frobbing the stack)
likely may fix #11026 and may fix #11003 (ref #10525) invalid stack-read on 32-bit

this additionally changes the julia specSig calling convention to pass
non-primitive types by pointer instead of by-value

this additionally fixes a bug in gen_cfunction that could be exposed by
turning off specSig

this additionally moves the alloca calls in ccall (and other places) to
the entry BasicBlock in the function, ensuring that llvm detects them as
static allocations and moves them into the function prologue

this additionally fixes some undefined behavior from changing
a variable's size through a alloca-cast instead of zext/sext/trunc

this additionally prepares for turning back on allocating tuples as vectors,
since the gc now guarantees 16-byte alignment

future work this makes possible:
 - create a function to replace the jlallocobj_func+init_bits_value call pair (to reduce codegen pressure)
 - allow moving pointers sometimes rather than always copying immutable data
 - teach the GC how it can re-use an existing pointer as a box
@garrison
Member Author

Thanks @vtjnash for (quite possibly) fixing this!

I'm trying to confirm that this actually fixes the invalid reads but am unable now to build 32-bit julia on my 64-bit machine. My Make.user is:

CFLAGS = -DMEMDEBUG
ARCH = i686
JULIA_CPU_TARGET = pentium4

and libgit2 fails during build with

[ 22%] Building C object CMakeFiles/git2.dir/deps/http-parser/http_parser.c.o
Linking C shared library libgit2.so
/usr/lib/x86_64-linux-gnu/libssl.so: error adding symbols: File in wrong format
collect2: error: ld returned 1 exit status
CMakeFiles/git2.dir/build.make:3213: recipe for target 'libgit2.so.0.22.2' failed

I believe this is due to a change in the build system (not a change on my computer), but I won't have time to fully investigate for a few weeks due to vacation. The file /usr/lib/i386-linux-gnu/libssl.so exists; it just isn't being used for whatever reason.

@vtjnash
Member

vtjnash commented Jun 16, 2015

cmake handles cross-compilation particularly badly. I usually just set USE_SYSTEM_LIBGIT2=1 to bypass this particular error (since libgit2 isn't currently used anywhere anyways)

@garrison
Member Author

Thanks @vtjnash, everything seems to work correctly now.

@vtjnash
Member

vtjnash commented Jun 16, 2015

just with the build, or valgrind too?

@garrison
Member Author

With valgrind too.

@vtjnash
Member

vtjnash commented Jun 16, 2015

Thanks!

@tkelman
Contributor

tkelman commented Jun 16, 2015

cmake handles cross-compilation particularly badly.

This particular issue is largely debian's fault for not allowing you to install i386 libraries without uninstalling large numbers of x86_64 libraries first.

edit: hm, @garrison might not be using Debian if he was able to get openssl:i386 installed without tons of conflicts, but cmake's FIND_PACKAGE(OpenSSL) just hands off to pkg-config which doesn't actually check that it's able to link. A few extra lines in libgit2's cmake configuration to check that would probably work.

Also to nitpick on terminology, this isn't a cross-compile, it's multiarch. If we were using a true cross-compiler prefix i386-linux-gnu-gcc with a separate sysroot (and i386-linux-gnu-pkg-config) this would be fine. If libgit2 were using autotools it wouldn't handle this problem any better.

@garrison
Member Author

@tkelman I am using Debian jessie, and overall I've had very few issues with multiarch once I figured out the right settings for Make.user. (I did have some issues when I tried to use an older gcc, though, but those were solved once I installed gfortran-4.8-multilib.)

@tkelman
Contributor

tkelman commented Jun 17, 2015

The issue here lies with pkg-config not knowing anything about multiarch, https://blogs.gnome.org/jamesh/2005/07/04/pkg-config-vs-cross-compile-and-multi-arch/

I think we'd have to come up with a dummy test library and use check_c_source_compiles in libgit2's cmakelists to avoid linking to openssl in this case. Or fix pkg-config - cmake isn't at fault here.
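
For anyone hitting the same "File in wrong format" error, a quick way to see which architecture a given shared library was built for is to read the class byte of its ELF header. The sketch below is not the check_c_source_compiles approach suggested above and is not part of libgit2 or cmake; it is a cheaper heuristic, and the helper names `elf_class` and `usable_for` are hypothetical.

```python
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}   # EI_CLASS values ELFCLASS32 / ELFCLASS64


def elf_class(header):
    """Return '32-bit' or '64-bit' for the leading bytes of an ELF file, else None."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None              # not an ELF file (for example, a linker script)
    return ELF_CLASSES.get(header[4])


def usable_for(path, wanted):
    """True if the file at `path` is an ELF object of the wanted class."""
    with open(path, "rb") as f:
        return elf_class(f.read(5)) == wanted


print(elf_class(b"\x7fELF\x01"), elf_class(b"\x7fELF\x02"))
```

Calling `usable_for` on the two libssl.so paths mentioned earlier in the thread, with `"32-bit"` as the wanted class, would show which of them a 32-bit link could actually use.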

#11021

fcard pushed a commit to fcard/julia that referenced this issue Jul 8, 2015
fix JuliaLang#11187 (pass struct and tuple objects by stack pointer)
fix JuliaLang#11450 (ccall emission was frobbing the stack)
likely may fix JuliaLang#11026 and may fix JuliaLang#11003 (ref JuliaLang#10525) invalid stack-read on 32-bit

this additionally changes the julia specSig calling convention to pass
non-primitive types by pointer instead of by-value

this additionally fixes a bug in gen_cfunction that could be exposed by
turning off specSig

this additionally moves the alloca calls in ccall (and other places) to
the entry BasicBlock in the function, ensuring that llvm detects them as
static allocations and moves them into the function prologue

this additionally fixes some undefined behavior from changing
a variable's size through a alloca-cast instead of zext/sext/trunc

this additionally prepares for turning back on allocating tuples as vectors,
since the gc now guarantees 16-byte alignment

future work this makes possible:
 - create a function to replace the jlallocobj_func+init_bits_value call pair (to reduce codegen pressure)
 - allow moving pointers sometimes rather than always copying immutable data
 - teach the GC how it can re-use an existing pointer as a box
@tkelman
Contributor

tkelman commented Jul 10, 2015

@garrison since this is closed, we should check off the box for this issue in the list at #9336, right?
