Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bug.
After the b2950 patch, RPC functionality is broken. When offloading to 3 machines, the first server crashes with the message below. Reverting to b2949 fixes the problem.
ll_startrpc
create_backend: using CUDA backend
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: yes
ggml_cuda_init: CUDA_USE_TENSOR_CORES: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce GTX 1070, compute capability 6.1, VMM: yes
Starting RPC server on 0.0.0.0:50052, backend memory: 8022 MB
Accepted client connection, free_mem=8412266496, total_mem=8500477952
GGML_ASSERT: /usr/local/src/ai/llamacpp/llama.cpp/ggml-backend.c:226: offset + size <= ggml_nbytes(tensor) && "tensor write out of bounds"
[New LWP 11678]
[New LWP 11684]
[New LWP 11685]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f66930dc3c7 in wait4 () from /lib64/libc.so.6
#0  0x00007f66930dc3c7 in wait4 () from /lib64/libc.so.6
#1  0x0000000000411f4b in ggml_print_backtrace ()
#2  0x000000000046639a in ggml_backend_tensor_set ()
#3  0x0000000000541d20 in start_rpc_server ()
#4  0x0000000000406ebc in main ()
[Inferior 1 (process 11677) detached]
/usr/local/bin/ll_startrpc: line 14: 11677 Aborted rpc-server -H 0.0.0.0 -p 50052
You are most probably running an old rpc-server with a new build of llama.cpp. We added #pragma pack(push, 1) to rpc_tensor, so it is now serialized into 292 bytes instead of 296. Make sure you are building rpc-server from the same source tree as the rest of the binaries.
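For context, here is a minimal sketch of why 1-byte packing changes the serialized size of a struct. The fields below are hypothetical and chosen for illustration; they are not the actual rpc_tensor layout:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical layout for illustration only -- not the real rpc_tensor.
// Without packing, the compiler inserts padding to align the 8-byte member,
// so the struct is larger than the sum of its fields.
struct tensor_unpacked {
    uint32_t id;       // 4 bytes, followed by 4 bytes of padding
    uint64_t offset;   // 8 bytes, aligned to an 8-byte boundary
    uint32_t type;     // 4 bytes, followed by 4 bytes of trailing padding
};

#pragma pack(push, 1)
// With 1-byte packing all padding is removed, so the struct is exactly the
// sum of its field sizes. A server built without the packing change will
// misinterpret every byte after the first removed gap.
struct tensor_packed {
    uint32_t id;
    uint64_t offset;
    uint32_t type;
};
#pragma pack(pop)

int main() {
    printf("unpacked: %zu bytes\n", sizeof(tensor_unpacked)); // typically 24
    printf("packed:   %zu bytes\n", sizeof(tensor_packed));   // 16
    return 0;
}
```

A client and server compiled from different source trees therefore disagree on where each field lives in the byte stream, which is exactly how a bogus offset/size pair reaches ggml_backend_tensor_set and trips the out-of-bounds assert.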
I may add a new HELLO command that advertises the version of the rpc-server when a new client connects. That should prevent problems like this in the long term.
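One possible shape for such a handshake is sketched below. This is only an illustration of the idea: the version constant and the send/recv helpers are assumptions, not the actual llama.cpp RPC protocol or API:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical protocol version constant for the HELLO exchange.
constexpr uint32_t RPC_PROTO_VERSION = 1;

// send_fn/recv_fn stand in for whatever socket helpers the server uses.
using send_fn = bool (*)(int sockfd, const void * data, size_t size);
using recv_fn = bool (*)(int sockfd, void * data, size_t size);

// Server side: on accept, announce the protocol version before anything else.
bool server_hello(int sockfd, send_fn send_data) {
    uint32_t version = RPC_PROTO_VERSION;
    return send_data(sockfd, &version, sizeof(version));
}

// Client side: read the version and refuse a mismatched server up front,
// instead of failing later with an out-of-bounds tensor write.
bool client_check_hello(int sockfd, recv_fn recv_data) {
    uint32_t version = 0;
    if (!recv_data(sockfd, &version, sizeof(version))) {
        return false;
    }
    return version == RPC_PROTO_VERSION;
}
```

With a check like this, a version mismatch fails immediately at connect time with a clear error rather than as a GGML_ASSERT mid-transfer.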