luajit2 internal state corruption leads to segmentation fault #102

Closed
neoxic opened this issue Sep 16, 2020 · 3 comments

neoxic commented Sep 16, 2020

Hello,
First of all, thank you for your brilliant work on this LuaJIT alternative!

This bug was first reported upstream, but it was rejected without explanation. Please see the following link:
LuaJIT/LuaJIT#615

It may also be related to the currently open luajit2 issues #97 and #98, so there might be an elevated need to get it fixed as soon as possible. We have managed to put together a one-file test case that reproduces the bug. Below is a reverse debugging session (rr) with it under luajit2 HEAD. Since execution is replayed in reverse, read it backwards:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000033 in ?? ()
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.5-39.el7.x86_64
(rr) bt
#0  0x0000000000000033 in ?? ()
#1  0x0000000000457419 in lj_fff_fallback () at buildvm_x86.dasc:2243
#2  0x00000000004565f5 in lj_ff_coroutine_resume () at buildvm_x86.dasc:1733
#3  0x000000000044d106 in lua_pcall (L=L@entry=0x7f709785a378, nargs=nargs@entry=0, nresults=-1, errfunc=errfunc@entry=2) at lj_api.c:1131
#4  0x00000000004042e7 in docall (L=L@entry=0x7f709785a378, narg=narg@entry=0, clear=clear@entry=0) at luajit.c:121
#5  0x0000000000404af1 in handle_script (L=L@entry=0x7f709785a378, argx=argx@entry=0x7ffe6eb69520) at luajit.c:292
#6  0x0000000000405167 in pmain (L=0x7f709785a378) at luajit.c:553
#7  0x00000000004557a5 in lj_BC_FUNCC () at buildvm_x86.dasc:849
#8  0x000000000044d2b0 in lua_cpcall (L=L@entry=0x7f709785a378, func=func@entry=0x405044 <pmain>, ud=ud@entry=0x0) at lj_api.c:1155
#9  0x0000000000405224 in main (argc=2, argv=0x7ffe6eb69518) at luajit.c:582
(rr) info reg rax
rax            0x7f709681a820   140121538144288
(rr) watch -l ((GCfuncC*)0x7f709681a820)->f
Hardware watchpoint 1: -location ((GCfuncC*)0x7f709681a820)->f
(rr) rc
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000033 in ?? ()
(rr) rc
Continuing.
Hardware watchpoint 1: -location ((GCfuncC*)0x7f709681a820)->f

Old value = (int (*)(lua_State *)) 0x33
New value = (int (*)(lua_State *)) 0x167c1
0x0000000000411c2a in lj_alloc_malloc (msp=0x7f709785a010, nsize=<optimized out>) at lj_alloc.c:1338
1338          set_size_and_pinuse_of_inuse_chunk(ms, p, nb);
(rr) bt
#0  0x0000000000411c2a in lj_alloc_malloc (msp=0x7f709785a010, nsize=<optimized out>) at lj_alloc.c:1338
#1  0x0000000000411e15 in lj_alloc_f (msp=<optimized out>, ptr=<optimized out>, osize=<optimized out>, nsize=<optimized out>) at lj_alloc.c:1486
#2  0x0000000000428a4b in lj_mem_newgco (L=L@entry=0x7f709785a378, size=40) at lj_gc.c:834
#3  0x000000000042954c in func_newL (L=L@entry=0x7f709785a378, pt=pt@entry=0x7f7096819190, env=0x7f709785bd40) at lj_func.c:122
#4  0x0000000000431489 in lj_func_newL_gc (L=0x7f709785a378, pt=0x7f7096819190, parent=0x7f709681a340) at lj_func.c:160
#5  0x0000000000454879 in lj_BC_FNEW () at buildvm_x86.dasc:562
#6  0x000000000044c3ff in lua_call (L=L@entry=0x7f709785a378, nargs=nargs@entry=1, nresults=nresults@entry=1) at lj_api.c:1113
#7  0x000000000044ced0 in lj_cf_package_require (L=0x7f709785a378) at lib_package.c:459
#8  0x00000000004557a5 in lj_BC_FUNCC () at buildvm_x86.dasc:849
#9  0x000000000044d106 in lua_pcall (L=L@entry=0x7f709785a378, nargs=nargs@entry=0, nresults=-1, errfunc=errfunc@entry=2) at lj_api.c:1131
#10 0x00000000004042e7 in docall (L=L@entry=0x7f709785a378, narg=narg@entry=0, clear=clear@entry=0) at luajit.c:121
#11 0x0000000000404af1 in handle_script (L=L@entry=0x7f709785a378, argx=argx@entry=0x7ffe6eb69520) at luajit.c:292
#12 0x0000000000405167 in pmain (L=0x7f709785a378) at luajit.c:553
#13 0x00000000004557a5 in lj_BC_FUNCC () at buildvm_x86.dasc:849
#14 0x000000000044d2b0 in lua_cpcall (L=L@entry=0x7f709785a378, func=func@entry=0x405044 <pmain>, ud=ud@entry=0x0) at lj_api.c:1155
#15 0x0000000000405224 in main (argc=2, argv=0x7ffe6eb69518) at luajit.c:582
(rr) rc
Continuing.
Hardware watchpoint 1: -location ((GCfuncC*)0x7f709681a820)->f

Old value = (int (*)(lua_State *)) 0x167c1
New value = (int (*)(lua_State *)) 0x0
0x0000000000411c1e in lj_alloc_malloc (msp=0x7f709785a010, nsize=<optimized out>) at lj_alloc.c:1337
1337          set_size_and_pinuse_of_free_chunk(r, rsize);
(rr) bt
#0  0x0000000000411c1e in lj_alloc_malloc (msp=0x7f709785a010, nsize=<optimized out>) at lj_alloc.c:1337
#1  0x0000000000411e15 in lj_alloc_f (msp=<optimized out>, ptr=<optimized out>, osize=<optimized out>, nsize=<optimized out>) at lj_alloc.c:1486
#2  0x0000000000428a4b in lj_mem_newgco (L=L@entry=0x7f709785a378, size=40) at lj_gc.c:834
#3  0x000000000042954c in func_newL (L=L@entry=0x7f709785a378, pt=pt@entry=0x7f7096816dc8, env=0x7f709785bd40) at lj_func.c:122
#4  0x0000000000431489 in lj_func_newL_gc (L=0x7f709785a378, pt=0x7f7096816dc8, parent=0x7f709681a340) at lj_func.c:160
#5  0x0000000000454879 in lj_BC_FNEW () at buildvm_x86.dasc:562
#6  0x000000000044c3ff in lua_call (L=L@entry=0x7f709785a378, nargs=nargs@entry=1, nresults=nresults@entry=1) at lj_api.c:1113
#7  0x000000000044ced0 in lj_cf_package_require (L=0x7f709785a378) at lib_package.c:459
#8  0x00000000004557a5 in lj_BC_FUNCC () at buildvm_x86.dasc:849
#9  0x000000000044d106 in lua_pcall (L=L@entry=0x7f709785a378, nargs=nargs@entry=0, nresults=-1, errfunc=errfunc@entry=2) at lj_api.c:1131
#10 0x00000000004042e7 in docall (L=L@entry=0x7f709785a378, narg=narg@entry=0, clear=clear@entry=0) at luajit.c:121
#11 0x0000000000404af1 in handle_script (L=L@entry=0x7f709785a378, argx=argx@entry=0x7ffe6eb69520) at luajit.c:292
#12 0x0000000000405167 in pmain (L=0x7f709785a378) at luajit.c:553
#13 0x00000000004557a5 in lj_BC_FUNCC () at buildvm_x86.dasc:849
#14 0x000000000044d2b0 in lua_cpcall (L=L@entry=0x7f709785a378, func=func@entry=0x405044 <pmain>, ud=ud@entry=0x0) at lj_api.c:1155
#15 0x0000000000405224 in main (argc=2, argv=0x7ffe6eb69518) at luajit.c:582
(rr) rc
Continuing.
Hardware watchpoint 1: -location ((GCfuncC*)0x7f709681a820)->f

Old value = (int (*)(lua_State *)) 0x0
New value = <unreadable>
_raw_syscall () at /opt/APP/master/rr/src/preload/raw_syscall.S:120
120             callq *32(%rsp)
(rr) bt
#0  _raw_syscall () at /opt/APP/master/rr/src/preload/raw_syscall.S:120
#1  0x00007f7097558055 in traced_raw_syscall (call=call@entry=0x681fffa0) at /opt/APP/master/rr/src/preload/syscallbuf.c:252
#2  0x00007f709755ae4f in syscall_hook_internal (call=0x681fffa0) at /opt/APP/master/rr/src/preload/syscallbuf.c:3202
#3  syscall_hook (call=0x681fffa0) at /opt/APP/master/rr/src/preload/syscallbuf.c:3236
#4  0x00007f7097557e80 in _syscall_hook_trampoline () at /opt/APP/master/rr/src/preload/syscall_hook.S:313
#5  0x00007f7097557edf in __morestack () at /opt/APP/master/rr/src/preload/syscall_hook.S:458
#6  0x00007f7097557efa in _syscall_hook_trampoline_48_3d_00_f0_ff_ff () at /opt/APP/master/rr/src/preload/syscall_hook.S:469
#7  0x00007f7096b65da0 in mmap64 () from /lib64/libc.so.6
#8  0x0000000000411736 in mmap_probe (size=size@entry=131072) at lj_alloc.c:258
#9  0x0000000000411918 in alloc_sys (m=m@entry=0x7f709785a010, nb=nb@entry=552) at lj_alloc.c:1013
#10 0x0000000000411c94 in lj_alloc_malloc (msp=0x7f709785a010, nsize=<optimized out>) at lj_alloc.c:1356
#11 0x0000000000411e15 in lj_alloc_f (msp=<optimized out>, ptr=<optimized out>, osize=<optimized out>, nsize=<optimized out>) at lj_alloc.c:1486
#12 0x000000000041829e in lj_mem_realloc (L=0x7f709785a378, p=p@entry=0x0, osz=osz@entry=0, nsz=544) at lj_gc.c:821
#13 0x0000000000426685 in lj_trace_alloc (L=<optimized out>, T=T@entry=0x7f709785a688) at lj_trace.c:128
#14 0x000000000042f12d in lj_asm_trace (J=J@entry=0x7f709785a688, T=T@entry=0x7f709785a688) at lj_asm.c:2286
#15 0x0000000000445750 in trace_state (L=0x7f709785a378, dummy=<optimized out>, ud=0x7f709785a688) at lj_trace.c:698
#16 0x0000000000455b9b in lj_vm_cpcall () at buildvm_x86.dasc:1237
#17 0x00000000004108c5 in lj_trace_ins (J=J@entry=0x7f709785a688, pc=pc@entry=0x7f70977e4220) at lj_trace.c:730
#18 0x0000000000443d92 in lj_dispatch_ins (L=0x7f709785a378, pc=0x7f70977e4224) at lj_dispatch.c:424
#19 0x000000000045752d in lj_vm_inshook () at buildvm_x86.dasc:2462
#20 0x000000000044c3ff in lua_call (L=L@entry=0x7f709785a378, nargs=nargs@entry=1, nresults=nresults@entry=1) at lj_api.c:1113
#21 0x000000000044ced0 in lj_cf_package_require (L=0x7f709785a378) at lib_package.c:459
#22 0x00000000004557a5 in lj_BC_FUNCC () at buildvm_x86.dasc:849
#23 0x000000000044d106 in lua_pcall (L=L@entry=0x7f709785a378, nargs=nargs@entry=0, nresults=-1, errfunc=errfunc@entry=2) at lj_api.c:1131
#24 0x00000000004042e7 in docall (L=L@entry=0x7f709785a378, narg=narg@entry=0, clear=clear@entry=0) at luajit.c:121
#25 0x0000000000404af1 in handle_script (L=L@entry=0x7f709785a378, argx=argx@entry=0x7ffe6eb69520) at luajit.c:292
#26 0x0000000000405167 in pmain (L=0x7f709785a378) at luajit.c:553
#27 0x00000000004557a5 in lj_BC_FUNCC () at buildvm_x86.dasc:849
#28 0x000000000044d2b0 in lua_cpcall (L=L@entry=0x7f709785a378, func=func@entry=0x405044 <pmain>, ud=ud@entry=0x0) at lj_api.c:1155
#29 0x0000000000405224 in main (argc=2, argv=0x7ffe6eb69518) at luajit.c:582
(rr)
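
One possible way to read the values that the watchpoint reports for the `f` field is as allocator chunk header words, since each write comes from `set_size_and_pinuse_of_free_chunk` or `set_size_and_pinuse_of_inuse_chunk` in lj_alloc.c. The sketch below assumes the flag layout of dlmalloc, from which lj_alloc.c is derived (bit 0 = previous chunk in use, bit 1 = current chunk in use, remaining bits = chunk size). The helper name `decode_head` is hypothetical and is not part of LuaJIT. This is only an interpretation aid, not a diagnosis of the root cause.

```python
# Flag bits as used by dlmalloc-style chunk headers (assumed to apply to lj_alloc.c).
PINUSE_BIT = 0x1  # previous chunk is in use
CINUSE_BIT = 0x2  # current chunk is in use
INUSE_BITS = PINUSE_BIT | CINUSE_BIT


def decode_head(head):
    """Split a chunk header word into its size and in-use flags (hypothetical helper)."""
    return {
        "size": head & ~INUSE_BITS,
        "pinuse": bool(head & PINUSE_BIT),
        "cinuse": bool(head & CINUSE_BIT),
    }


if __name__ == "__main__":
    # The two readable values seen at ((GCfuncC*)0x7f709681a820)->f in the session.
    for value in (0x167C1, 0x33):
        print(hex(value), decode_head(value))
```

Under these assumptions, 0x167c1 decodes to a 92096-byte chunk with only the previous-in-use bit set, and 0x33 decodes to a 48-byte chunk with both bits set. The latter would be consistent with the 40-byte request visible in the `lj_mem_newgco` frame once it is padded to a chunk size, which suggests that the crash address 0x33 is a chunk header rather than a damaged function pointer.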

The test case is somewhat Frankenstein-style, but it is a single test.lua file written in pure Lua without any external dependencies, and it runs on vanilla Lua 5.1 and LuaJIT 2.0.5 without any issues. The session above makes it clear that the corruption takes place in LuaJIT's internals and does not involve any external libraries. Please also note that, as stated in the upstream issue(s), the bug appears to be JIT-related: we could not reproduce it on LuaJIT built with LUAJIT_DISABLE_JIT. The debugging session shows this as well, since the earliest write (the last backtrace shown) goes through lj_trace.c.

We're not comfortable publishing the test case here. However, we will be more than happy to send it privately via email to any interested party. Moreover, the debugging session above is still active, so please feel free to ask for additional info. We are willing to help nail the bug down as much as we can.

FYI luajit2 is built with:

make amalg XCFLAGS="-DLUA_USE_APICHECK -Og -g"

Please note that, as stated in the upstream issue, we first cleared all internal assertions by building LuaJIT 2.1 with LUA_USE_ASSERT. We have also tried to use the OpenResty gdb extensions to get Lua backtraces etc., but they seem unable to work with 64-bit LuaJIT.

agentzh (Member) commented Sep 23, 2020

@neoxic Thanks for the report. @siddhesh Would you mind having a look at this? Thanks!

neoxic (Author) commented Oct 12, 2020

Please note that this has been fixed upstream. FYI LuaJIT/LuaJIT#624

agentzh (Member) commented Oct 13, 2020

@neoxic I've just merged the upstream v2.1 branch into our v2.1-agentzh branch. Please check it out. Thanks!

agentzh closed this as completed Oct 13, 2020