debug repl broken with mac llvm-gcc #523

Closed
vtjnash opened this issue Mar 5, 2012 · 14 comments
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@vtjnash

vtjnash commented Mar 5, 2012

On Mac OS X 10.7, a make debug build with the default compiler (Apple's llvm-gcc) doesn't work with the command-line REPL (readline or basic). However, it does work with scripts and passes all tests.

$ ./julia-debug-basic 
               _
   _       _ _(_)_     |
  (_)     | (_) (_)    |  A fresh approach to technical computing
   _ _   _| |_  __ _   |
  | | | | | | |/ _` |  |  Version 0.0.0-prerelease
  | | |_| | | | (_| |  |  Commit 3d331c5869 (2012-03-04 23:04:14)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |

julia> 1;
error: stack overflow
$ ./julia-debug-basic 
               _
   _       _ _(_)_     |
  (_)     | (_) (_)    |  A fresh approach to technical computing
   _ _   _| |_  __ _   |
  | | | | | | |/ _` |  |  Version 0.0.0-prerelease
  | | |_| | | | (_| |  |  Commit 3d331c5869 (2012-03-04 23:04:14)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |

julia> a;
error: stack overflow
$ gdb ./julia
...
julia> 1;

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00007fff5fbfed6c
0x00007fff5fbfed6c in ?? ()
(gdb) bt 
#0  0x00007fff5fbfed6c in ?? ()
#1  0x0000000103e1127a in ?? ()
#2  0x0000000103e10c93 in ?? ()
#3  0x00000001000163a4 in jl_apply (f=0x1010f6548, args=0x7fff5fbff130, nargs=1) at julia.h:778
#4  0x0000000100019004 in jl_trampoline (F=0x1010f6548, args=0x7fff5fbff130, nargs=1) at builtins.c:878
#5  0x000000010000e614 in jl_apply (f=0x1010f6548, args=0x7fff5fbff130, nargs=1) at julia.h:778
#6  0x0000000100010f85 in jl_apply_generic (F=0x1010f6388, args=0x7fff5fbff130, nargs=1) at gf.c:1105
#7  0x0000000103e0d383 in ?? ()
#8  0x0000000103e0ce56 in ?? ()
#9  0x00000001000163a4 in jl_apply (f=0x101432808, args=0x7fff5fbff3e8, nargs=1) at julia.h:778
#10 0x0000000100019004 in jl_trampoline (F=0x101432808, args=0x7fff5fbff3e8, nargs=1) at builtins.c:878
#11 0x000000010000e614 in jl_apply (f=0x101432808, args=0x7fff5fbff3e8, nargs=1) at julia.h:778
#12 0x0000000100010f85 in jl_apply_generic (F=0x101432788, args=0x7fff5fbff3e8, nargs=1) at gf.c:1105
#13 0x0000000103dfddf3 in ?? ()
#14 0x0000000103dfd75b in ?? ()
#15 0x00000001000163a4 in jl_apply (f=0x1028e6e68, args=0x0, nargs=0) at julia.h:778
#16 0x0000000100019004 in jl_trampoline (F=0x1028e6e68, args=0x0, nargs=0) at builtins.c:878
#17 0x000000010000e614 in jl_apply (f=0x1028e6e68, args=0x0, nargs=0) at julia.h:778
#18 0x0000000100010f85 in jl_apply_generic (F=0x1028e6b28, args=0x0, nargs=0) at gf.c:1105
#19 0x0000000100001c64 in jl_apply ()
#20 0x0000000100001df9 in true_main ()
#21 0x0000000100072aea in julia_trampoline (argc=0, argv=0x7fff5fbffb70, pmain=0x100001c80 <true_main>) at init.c:199
#22 0x000000010000203d in main ()
(gdb)  

....
$ ../julia all.j 
     * core
     * numbers
     * strings
     * corelib
     * hashing
     * arrayops
     * sparse
     * lapack
     * fft
     * arpack
     * random
     * amos
     * unicode
     * perf

$ git pull
Already up-to-date.
$ git status
# On branch master
nothing to commit (working directory clean)
$ 
@StefanKarpinski

I can confirm that I'm getting this too. Very strange.

@StefanKarpinski

It's not very helpful, but this is the backtrace I get in gdb:

julia> 1

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00007fff5fbfe82c
0x00007fff5fbfe82c in ?? ()
(gdb) bt
#0  0x00007fff5fbfe82c in ?? ()
#1  0x000000010260821a in ?? ()
#2  0x0000000102607c33 in ?? ()
#3  0x00000001000160a4 in jl_apply (f=0x101c5ad28, args=0x7fff5fbfebf0, nargs=1) at julia.h:778
#4  0x0000000100018d04 in jl_trampoline (F=0x101c5ad28, args=0x7fff5fbfebf0, nargs=1) at builtins.c:878
#5  0x000000010000e314 in jl_apply (f=0x101c5ad28, args=0x7fff5fbfebf0, nargs=1) at julia.h:778
#6  0x0000000100010c85 in jl_apply_generic (F=0x101c5ab88, args=0x7fff5fbfebf0, nargs=1) at gf.c:1105
#7  0x0000000102604323 in ?? ()
#8  0x0000000102603df6 in ?? ()
#9  0x00000001000160a4 in jl_apply (f=0x101accbe8, args=0x7fff5fbfeea8, nargs=1) at julia.h:778
#10 0x0000000100018d04 in jl_trampoline (F=0x101accbe8, args=0x7fff5fbfeea8, nargs=1) at builtins.c:878
#11 0x000000010000e314 in jl_apply (f=0x101accbe8, args=0x7fff5fbfeea8, nargs=1) at julia.h:778
#12 0x0000000100010c85 in jl_apply_generic (F=0x101accb68, args=0x7fff5fbfeea8, nargs=1) at gf.c:1105
#13 0x00000001025f4d93 in ?? ()
#14 0x00000001025f46fb in ?? ()
#15 0x00000001000160a4 in jl_apply (f=0x101aa7d68, args=0x0, nargs=0) at julia.h:778
#16 0x0000000100018d04 in jl_trampoline (F=0x101aa7d68, args=0x0, nargs=0) at builtins.c:878
#17 0x000000010000e314 in jl_apply (f=0x101aa7d68, args=0x0, nargs=0) at julia.h:778
#18 0x0000000100010c85 in jl_apply_generic (F=0x101aa79e8, args=0x0, nargs=0) at gf.c:1105
#19 0x0000000100001964 in jl_apply ()
#20 0x0000000100001af9 in true_main ()
#21 0x0000000100072b6a in julia_trampoline (argc=0, argv=0x7fff5fbff630, pmain=0x100001980 <true_main>) at init.c:199
#22 0x0000000100001d3d in main ()

@Keno

Keno commented Mar 5, 2012

Could this be caused by the same problem as #416? Also, can you try building llvm in debug mode and uncommenting the following line in codegen.cpp:

llvm::JITEmitDebugInfo = true;

@vtjnash

vtjnash commented Mar 5, 2012

No, complex expressions fail too; I just wanted to show that it fails for even the simplest test case. That line gets included automatically on a debug build. Building llvm in debug mode with assertions gives the same result (aside from being painfully slow).

@Keno

Keno commented Mar 5, 2012

Does gdb show a better backtrace in debug mode?

@vtjnash

vtjnash commented Mar 5, 2012

No, it's the same. It's a known issue that gdb can't use the symbolic info for JIT code on Mac.

@Keno

Keno commented Mar 5, 2012

Oh, I didn't know that Macs have a different object format. I personally don't use Macs and just assumed it would be similar. I'll see if I can replicate it on my Linux machine.

@Keno

Keno commented Mar 6, 2012

@vtjnash Are you using gdb version 7.0 or newer (that's the version that added JIT support)?

@vtjnash

vtjnash commented Mar 6, 2012

I wasn't. But I just tried gdb 7.3, and it seems they haven't added support for JIT on Mac yet. I also tried lldb, but it wasn't much help either (aside from finding a different potential, minor bug :). I haven't found a way of replicating it on my Linux box.

Addendum: this happens with just hitting enter in the REPL with no input. It does not happen with "./julia -b" but does with "./julia -J sys0.ji".

@JeffBezanson

Is it possible that llvm::JITEmitDebugInfo = true; is causing problems? What about a debug build without that line present?

@vtjnash

vtjnash commented Mar 8, 2012

That didn't work either. I don't know how this is useful, but by some magic incantation, uncommenting #define JL_TRACE in gf.c can make it go away.

@vtjnash

vtjnash commented Mar 10, 2012

With the latest code, I'm getting basically the same backtrace, but now I'm almost getting the prompt back too:

               _
   _       _ _(_)_     |
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  A fresh approach to technical computing
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.0.0+1331364586.ra1fc.dirty
 _/ |\__'_|_|_|\__'_|  |  Commit a1fcc12042 (2012-03-10 02:29:46)*
|__/                   |


julia> 1
1

julia> 
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00007fff5fbff12c
0x00007fff5fbff12c in ?? ()
(gdb) bt
#0  0x00007fff5fbff12c in ?? ()
#1  0x0000000103701227 in ?? ()
#2  0x00000001000105c4 in jl_apply (f=0x10131f308, args=0x7fff5fbff4d8, nargs=1) at julia.h:779
#3  0x0000000100013005 in jl_apply_generic (F=0x10131f228, args=0x7fff5fbff4d8, nargs=1) at gf.c:1108
#4  0x0000000103700537 in ?? ()
#5  0x00000001036fef61 in ?? ()
#6  0x0000000100018584 in jl_apply (f=0x101140b28, args=0x7fff5fbff8a0, nargs=1) at julia.h:779
#7  0x000000010001b214 in jl_trampoline (F=0x101140b28, args=0x7fff5fbff8a0, nargs=1) at builtins.c:883
#8  0x00000001000105c4 in jl_apply (f=0x101140b28, args=0x7fff5fbff8a0, nargs=1) at julia.h:779
#9  0x0000000100013005 in jl_apply_generic (F=0x1011409a8, args=0x7fff5fbff8a0, nargs=1) at gf.c:1108
#10 0x00000001036feb25 in ?? ()
#11 0x0000000100018584 in jl_apply (f=0x1028be668, args=0x0, nargs=0) at julia.h:779
#12 0x000000010001b214 in jl_trampoline (F=0x1028be668, args=0x0, nargs=0) at builtins.c:883
#13 0x0000000100075d24 in jl_apply (f=0x1028be668, args=0x0, nargs=0) at julia.h:779
#14 0x0000000100075b83 in start_task (t=0x101392548) at task.c:370
#15 0x00000001000757c6 in switch_stack (t=0x101392548, where=0x101392568) at task.c:185
#16 0x0000000100075813 in jl_switch_stack (t=0x101392548, where=0x101392568) at task.c:195
#17 0x0000000100074d18 in julia_trampoline (argc=0, argv=0x7fff5fbffaf0, pmain=0x100001cd0 <true_main>) at init.c:200
#18 0x000000010000208d in main ()

@vtjnash vtjnash closed this as completed Mar 17, 2012
@vtjnash

vtjnash commented Mar 17, 2012

The debug build is working again in the REPL. I'm not sure if this ended up getting fixed somewhere, or if it is simply hidden.

@vtjnash

vtjnash commented Mar 31, 2012

I now believe the fault lies in the restore_stack function. It was occurring with a debug build for the uv branch, so I think I tracked it down to the value (t->stackbase-t->ssize) getting written into memory just after finishing the memcpy and just before we try to longjmp. Apparently it happened to pick the eventual stack pointer to overwrite. I'm not sure how. But moving the offset in the comparison of &_x seems to work to give the system enough space to not break:
if ((char*)&_x[64] > (char*)(t->stackbase-t->ssize))
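
For readers following along, here is a rough model of what that comparison changes. It is only a sketch under assumptions: it does not touch a real stack, the names task_model, region_low, must_recurse and x_bytes_overwritten are made up for illustration, and it assumes a downward-growing stack and a local declared as int _x[64]. It is not the code from task.c, and the thread does not confirm that this overlap is the actual cause of the crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a task's saved stack: restoring it overwrites
 * the bytes in [stackbase - ssize, stackbase). */
typedef struct {
    uintptr_t stackbase; /* one past the highest restored address */
    size_t    ssize;     /* number of bytes that memcpy will write */
} task_model;

/* Lowest address that the restoring memcpy will overwrite. */
static uintptr_t region_low(const task_model *t)
{
    assert(t->ssize <= t->stackbase);
    return t->stackbase - t->ssize;
}

/* Models the guard (char*)&_x[index] > (char*)(stackbase - ssize) for a
 * frame whose local int _x[64] starts at x_addr. A nonzero result means
 * the frame is still too high and has to recurse deeper. */
static int must_recurse(const task_model *t, uintptr_t x_addr, size_t index)
{
    return x_addr + index * sizeof(int) > region_low(t);
}

/* How many bytes of _x[64] fall inside the region that memcpy overwrites. */
static size_t x_bytes_overwritten(const task_model *t, uintptr_t x_addr)
{
    uintptr_t lo    = region_low(t);
    uintptr_t x_end = x_addr + 64 * sizeof(int);
    uintptr_t start = x_addr > lo ? x_addr : lo;
    uintptr_t end   = x_end < t->stackbase ? x_end : t->stackbase;
    return end > start ? (size_t)(end - start) : 0;
}
```

In this model, comparing against &_x (index 0) lets the recursion stop as soon as the first element is below the region, so most of the array, and anything the compiler keeps above it in the frame, can still sit inside the bytes being overwritten. Comparing against &_x[64] only stops once the whole array is below the region, which leaves at least 64 * sizeof(int) bytes of extra room.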
