Segmentation fault(Core dumped) in tinygrad #180

Open
HysenX-LI opened this issue Aug 27, 2024 · 4 comments

Comments

@HysenX-LI

HysenX-LI commented Aug 27, 2024

I ran the project on Ubuntu 18.04 and got a "Segmentation fault (core dumped)" error.
Running "DEBUG=9 python -X faulthandler main.py" produced the error message below.
Is the llvmlite module required? It is not listed in setup.py, but tinygrad needs it.
Is there any way to solve this problem? Thank you.

Fatal Python error: Segmentation fault
Current thread 0x00007f422dfeb740 (most recent call first):
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/runtime/ops_llvm.py", line 54 in
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/helpers.py", line 285 in cpu_time_execution
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/runtime/ops_llvm.py", line 54 in call
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 104 in call
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 173 in run
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 223 in run_schedule
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/tensor.py", line 204 in realize
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/tensor.py", line 3256 in _wrapper
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/tinygrad/nn/state.py", line 129 in load_state_dict
File "/usr1/Project/exo/exo/inference/tinygrad/inference.py", line 51 in build_transformer
File "/usr1/Project/exo/exo/inference/tinygrad/inference.py", line 96 in ensure_shard
File "/usr1/Project/exo/exo/inference/tinygrad/inference.py", line 60 in infer_prompt
File "/usr1/Project/exo/exo/orchestration/standard_node.py", line 140 in _process_prompt
File "/usr1/Project/exo/exo/orchestration/standard_node.py", line 102 in process_prompt
File "/usr1/Project/exo/exo/api/chatgpt_api.py", line 308 in handle_post_chat_completions
File "/usr1/Project/exo/exo/api/chatgpt_api.py", line 253 in middleware
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/aiohttp/web_middlewares.py", line 114 in impl
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/aiohttp/web_app.py", line 537 in _handle
File "/root/anaconda3/envs/EXO/lib/python3.12/site-packages/aiohttp/web_protocol.py", line 459 in _handle_request
File "/root/anaconda3/envs/EXO/lib/python3.12/asyncio/events.py", line 88 in _run
File "/root/anaconda3/envs/EXO/lib/python3.12/asyncio/base_events.py", line 1987 in _run_once
File "/root/anaconda3/envs/EXO/lib/python3.12/asyncio/base_events.py", line 641 in run_forever
File "/root/anaconda3/envs/EXO/lib/python3.12/asyncio/base_events.py", line 674 in run_until_complete
File "/usr1/Project/exo/main.py", line 132 in
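
The same kind of traceback can also be captured without the "-X faulthandler" flag by enabling the standard-library faulthandler module from code. The following is a minimal sketch, not part of exo or tinygrad; the helper name enable_crash_traces and the log path are hypothetical and only for illustration.

```python
import faulthandler
import sys


def enable_crash_traces(log_path=None):
    """Enable faulthandler and return the stream that tracebacks are written to.

    Hypothetical helper: if log_path is given, tracebacks go to that file,
    otherwise to stderr. The returned stream must stay open for as long as
    faulthandler is enabled.
    """
    stream = open(log_path, "w") if log_path else sys.stderr
    # Dump the Python traceback of all threads on SIGSEGV, SIGFPE, SIGABRT, etc.
    faulthandler.enable(file=stream, all_threads=True)
    return stream


if __name__ == "__main__":
    crash_log = enable_crash_traces("crash_traceback.log")
    print("faulthandler enabled:", faulthandler.is_enabled())
```

Writing to a file instead of stderr can help when the crash happens inside a server process whose console output is not kept.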

Here is part of the GDB backtrace:
#0 0x0000000000000000 in ?? ()
#1 0x00007fffea8ec043 in E_4194304_4 ()
#2 0x00007ffff6b90052 in ffi_call_unix64 () from /root/anaconda3/envs/exo/lib/python3.12/lib-dynload/../../libffi.so.8
#3 0x00007ffff6b8e925 in ffi_call_int () from /root/anaconda3/envs/exo/lib/python3.12/lib-dynload/../../libffi.so.8
#4 0x00007ffff6b8f06e in ffi_call () from /root/anaconda3/envs/exo/lib/python3.12/lib-dynload/../../libffi.so.8
#5 0x00007fffeb2db7b7 in _call_function_pointer (argtypecount=, argcount=2, resmem=0x7fffffffcfc0, restype=, atypes=, avalues=,
pProc=0x7fffea8ec000 <E_4194304_4>, flags=) at /croot/python-split_1715024085344/work/build-static/stgdict.c:931
#6 _ctypes_callproc (pProc=0x7fffea8ec000 <E_4194304_4>, argtuple=0x7fffe463bb80, flags=, argtypes=0x7fffe45f55c0, restype=, checker=0x0)
at /croot/python-split_1715024085344/work/build-static/stgdict.c:1273
#7 0x00007fffeb2e595a in PyCFuncPtr_call () at :4167
#8 0x000000000055afc5 in _PyObject_Call.localalias () at /croot/python-split_1715024085344/_build_env/x86_64-conda-linux-gnu/sysroot/usr/include/bits/pycore_pyerrors.h:367
#9 0x0000000000529d7a in PyCFunction_Call (kwargs=,
args=, callable=)
at /croot/python-split_1715024085344/_build_env/x86_64-conda-linux-gnu/sysroot/usr/include/bits/pycore_pyerrors.h:387
#10 _PyEval_EvalFrameDefault () at /usr/local/src/conda/python-3.12.3/Programs/opcode_targets.h:3254

@AlexCheema
Contributor

Does it work when you pip install llvmlite?

@HysenX-LI HysenX-LI closed this as not planned Aug 27, 2024
@HysenX-LI HysenX-LI reopened this Aug 27, 2024
@HysenX-LI
Author

This error occurred after installing llvmlite. Without llvmlite, the project fails earlier with an error, because tinygrad needs to call the library.
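
To confirm which versions are actually installed in the active environment, something like the sketch below may help. It uses only the standard library; the function name installed_versions is a hypothetical helper, and the distribution names "tinygrad" and "llvmlite" are assumed to match the names the packages were installed under.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust if the packages were installed differently.
    for name, version in installed_versions(["tinygrad", "llvmlite"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the same conda environment as main.py shows whether llvmlite is present and which versions are in use, which is useful to include in the report.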

@HysenX-LI
Author

HysenX-LI commented Aug 27, 2024

I'm not sure, but this may be a problem with tinygrad and llvmlite running on Linux, because it seems related to unresolved issues in both projects: numba/llvmlite#1075 and tinygrad/tinygrad#1367.

@HysenX-LI
Author

The numba/llvmlite#1075 issue I mentioned above is the error I got when running the project after deleting the part of tinygrad's helpers.py that triggered the crash (the "cb()" call in the "cpu_time_execution" function). But I think both problems are caused by tinygrad's lack of support for certain instructions on Linux. Have you ever run the project on Linux, and if so, in what environment?
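
A note on that workaround: judging by its name and its position in the traceback, "cpu_time_execution" looks like a thin timing wrapper around the callback. If that assumption holds, removing "cb()" skips the compiled kernel entirely instead of fixing it, so later errors would be expected. The sketch below is a hypothetical stand-in that illustrates the pattern; it is not tinygrad's actual code.

```python
import time


def timed_call(cb, enable):
    """Hypothetical stand-in for a timing wrapper.

    Runs cb() and returns the elapsed seconds when enable is true,
    otherwise returns None. The callback does the real work.
    """
    start = time.perf_counter() if enable else None
    cb()  # deleting this line would mean the work is never performed
    return time.perf_counter() - start if enable else None


if __name__ == "__main__":
    results = []
    elapsed = timed_call(lambda: results.append(sum(range(1000))), enable=True)
    print("work done:", results, "elapsed seconds:", elapsed)
```

In a wrapper of this shape the crash originates in whatever the callback invokes, which here would be the JIT-compiled function shown in frame #1 of the GDB backtrace.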


3 participants
@AlexCheema @HysenX-LI and others