
Update llama.cpp to b7709 #905

Merged
MarcusDunn merged 3 commits into utilityai:main from AsbjornOlling:bump-llamacpp-9e41884
Jan 13, 2026
Conversation

@AsbjornOlling
Contributor

The upstream version of llama.cpp has been stuck on the same version for a month, because of some dependency changes in their build system.

I opened ggml-org/llama.cpp#18670 to fix it, which was just now merged.

This PR just bumps the submodule commit to use the latest version of llama.cpp.

@AsbjornOlling
Contributor Author

It looks like the tests are failing because llama.cpp changed the default value of n_gpu_layers from 999 to -1.

ggml-org/llama.cpp@026d2ad47#diff-36e262e316ec1404e29880eb8b8ce4660ac584f0d0434710efc48a66497bdb59L7797
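To illustrate why the default change matters: 999 was "more layers than any model has, so effectively all", while -1 is an explicit "all layers" sentinel, so code that assumed the value was non-negative can break. The helper below is hypothetical, for illustration only — it is not llama.cpp's or this crate's actual code:

```rust
// Hypothetical sketch of how an n_gpu_layers value might be resolved.
// Old default: 999 (clamped down to the model's layer count).
// New default: -1 (explicitly "offload every layer").
fn resolve_gpu_layers(n_gpu_layers: i32, model_layers: u32) -> u32 {
    if n_gpu_layers < 0 {
        // -1 (the new default): offload all layers
        model_layers
    } else {
        // non-negative: offload at most that many layers
        (n_gpu_layers as u32).min(model_layers)
    }
}

fn main() {
    // For a 32-layer model, both defaults mean "all layers" --
    // but only if callers handle the negative sentinel.
    assert_eq!(resolve_gpu_layers(999, 32), 32);
    assert_eq!(resolve_gpu_layers(-1, 32), 32);
    assert_eq!(resolve_gpu_layers(8, 32), 8);
    println!("ok");
}
```

Tests that compared the parameter against the old literal default, or cast it to an unsigned type, see the new -1 default as a very large number instead.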

@AsbjornOlling
Contributor Author

It looks like the CUDA build failure was due to a lack of disk space on the GitHub Actions runner?
Not sure how GitHub Actions deals with disk space. Maybe we were just unlucky to land on a low-disk runner, and re-triggering the pipeline will fix it?

Here are the CUDA build failure logs, showing the disk space errors:
#7 3064.5   [ 97%] Linking CXX executable ../../bin/llama-quantize
#7 3064.5   [ 97%] Built target llama-perplexity
#7 3064.5   [ 97%] Building CXX object tools/tokenize/CMakeFiles/llama-tokenize.dir/tokenize.cpp.o
#7 3064.5   [ 98%] Linking CXX executable ../../bin/llama-tokenize
#7 3064.5 
#7 3064.5   --- stderr
#7 3064.5   running: cd "/target/debug/build/llama-cpp-sys-2-a081c722b10463e6/out/build" && CMAKE_PREFIX_PATH="" LC_ALL="C" "cmake" "/llama-cpp-sys-2/llama.cpp" "-B" "/target/debug/build/llama-cpp-sys-2-a081c722b10463e6/out/build" "-DLLAMA_BUILD_TESTS=OFF" "-DLLAMA_BUILD_EXAMPLES=OFF" "-DLLAMA_BUILD_SERVER=OFF" "-DLLAMA_BUILD_TOOLS=OFF" "-DLLAMA_CURL=OFF" "-DLLAMA_BUILD_COMMON=ON" "-DLLAMA_BUILD_TOOLS=ON" "-DCMAKE_BUILD_PARALLEL_LEVEL=4" "-DGGML_NATIVE=OFF" "-DBUILD_SHARED_LIBS=OFF" "-DGGML_CUDA=ON" "-DGGML_OPENMP=ON" "-DCMAKE_INSTALL_PREFIX=/target/debug/build/llama-cpp-sys-2-a081c722b10463e6/out" "-DCMAKE_C_FLAGS= -ffunction-sections -fdata-sections -fPIC -m64" "-DCMAKE_C_COMPILER=/usr/bin/cc" "-DCMAKE_CXX_FLAGS= -ffunction-sections -fdata-sections -fPIC -m64" "-DCMAKE_CXX_COMPILER=/usr/bin/c++" "-DCMAKE_ASM_FLAGS= -ffunction-sections -fdata-sections -fPIC -m64" "-DCMAKE_ASM_COMPILER=/usr/bin/cc" "-DCMAKE_BUILD_TYPE=Release"
#7 3064.5   CMAKE_BUILD_TYPE=Release
#7 3064.5   fatal: not a git repository: /llama-cpp-sys-2/llama.cpp/../../../../1/fs/modules/llama-cpp-sys-2/llama.cpp
#7 3064.5   fatal: not a git repository: /llama-cpp-sys-2/llama.cpp/../../../../1/fs/modules/llama-cpp-sys-2/llama.cpp
#7 3064.5   CMake Warning at common/CMakeLists.txt:29 (message):
#7 3064.5     Git index not found in git repository.
#7 3064.5 
#7 3064.5 
#7 3064.5   CMake Warning:
#7 3064.5     Manually-specified variables were not used by the project:
#7 3064.5 
#7 3064.5       CMAKE_BUILD_PARALLEL_LEVEL
#7 3064.5 
#7 3064.5 
#7 3064.5   running: cd "/target/debug/build/llama-cpp-sys-2-a081c722b10463e6/out/build" && LC_ALL="C" MAKEFLAGS="-j --jobserver-fds=9,12 --jobserver-auth=9,12" "cmake" "--build" "/target/debug/build/llama-cpp-sys-2-a081c722b10463e6/out/build" "--target" "install" "--config" "Release"
#7 3064.5   gmake: warning: -j4 forced in submake: resetting jobserver mode.
#7 3064.5   /usr/bin/ld: final link failed: No space left on device
#7 3064.5   /usr/bin/ld: final link failed: No space left on device
#7 3064.5   collect2: error: ld returned 1 exit status
#7 3064.5   gmake[2]: *** [tools/llama-bench/CMakeFiles/llama-bench.dir/build.make:109: bin/llama-bench] Error 1
#7 3064.5   gmake[1]: *** [CMakeFiles/Makefile2:792: tools/llama-bench/CMakeFiles/llama-bench.dir/all] Error 2
#7 3064.5   gmake[1]: *** Waiting for unfinished jobs....
#7 3064.5   collect2: error: ld returned 1 exit status
#7 3064.5   gmake[2]: *** [tools/quantize/CMakeFiles/llama-quantize.dir/build.make:109: bin/llama-quantize] Error 1
#7 3064.5   gmake[1]: *** [CMakeFiles/Makefile2:891: tools/quantize/CMakeFiles/llama-quantize.dir/all] Error 2
#7 3064.5   /usr/bin/ld: final link failed: No space left on device
#7 3064.5   /usr/bin/ld: final link failed: No space left on device
#7 3064.5   collect2: error: ld returned 1 exit status
#7 3064.5   gmake[2]: *** [tools/imatrix/CMakeFiles/llama-imatrix.dir/build.make:109: bin/llama-imatrix] Error 1
#7 3064.5   gmake[1]: *** [CMakeFiles/Makefile2:759: tools/imatrix/CMakeFiles/llama-imatrix.dir/all] Error 2
#7 3064.5   collect2: error: ld returned 1 exit status
#7 3064.5   gmake[2]: *** [tools/tokenize/CMakeFiles/llama-tokenize.dir/build.make:109: bin/llama-tokenize] Error 1
#7 3064.5   gmake[1]: *** [CMakeFiles/Makefile2:924: tools/tokenize/CMakeFiles/llama-tokenize.dir/all] Error 2
#7 3064.5   gmake: *** [Makefile:136: all] Error 2

@AsbjornOlling
Contributor Author

I'll also bump to the latest llama.cpp again.

@AsbjornOlling AsbjornOlling changed the title from "Update llama.cpp to latest commit on master: 9e41884" to "Update llama.cpp to b7709" Jan 12, 2026
@AsbjornOlling
Contributor Author

AsbjornOlling commented Jan 12, 2026

Odd. The CUDA build still fails in CI (I ran the GitHub Actions workflow on my fork), but building the test-build.Dockerfile locally works fine.

@MarcusDunn
Contributor

don't worry about the CUDA build - GitHub Actions is no good.

@AsbjornOlling
Contributor Author

don't worry about the CUDA build - GitHub Actions is no good.

Cool cool. I was also coming around to thinking it's likely an OOM error, or something like that...

Are we ready to merge this, then?

@MarcusDunn MarcusDunn merged commit 6d57a06 into utilityai:main Jan 13, 2026
3 of 5 checks passed
@AsbjornOlling
Contributor Author

Thanks for merging this, and for your ongoing maintainership of this crate! 🫂

