[Feature]: Add llama.cpp integration to TheRock #1652
I changed the PR to Draft because llama.cpp is currently integrated as a Git submodule rather than being managed via Python scripts.
I've updated the test scripts and verified that they work correctly on Linux. As part of this, I created a Python script that automates running the integrated llama.cpp tests, which could be useful for CI/CD workflows. The script currently skips test-backend-ops due to a floating-point exception issue on GFX1030 (tested using GFX1031 via …

I'm still unsure whether we need additional sanity checks. My initial thought was to verify that the built binaries are linked against HIP/ROCm, but that might be overkill. Perhaps running a minimal binary to confirm functionality would be sufficient.

Also, I've switched to using the upstream …
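The linkage check floated in this comment could be sketched roughly as below. This is only an illustration, not part of the PR: the function names are hypothetical, and it assumes a Linux host with `ldd` available and that the HIP runtime appears in the dependency list under a name starting with `libamdhip64.so`.

```python
import subprocess
from pathlib import Path

# Assumption: the HIP runtime shows up in `ldd` output under this name prefix.
HIP_RUNTIME_PREFIX = "libamdhip64.so"


def linked_libraries(ldd_output: str) -> list[str]:
    """Extract the library file names from text printed by `ldd`."""
    names = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields:
            # The first field is either a bare soname or a full path.
            names.append(Path(fields[0]).name)
    return names


def links_against_hip(ldd_output: str) -> bool:
    """Return True if any listed dependency looks like the HIP runtime."""
    return any(
        name.startswith(HIP_RUNTIME_PREFIX)
        for name in linked_libraries(ldd_output)
    )


def check_binary(binary: Path) -> bool:
    """Run `ldd` on a binary (Linux only) and check for the HIP runtime."""
    result = subprocess.run(
        ["ldd", str(binary)], capture_output=True, text=True, check=True
    )
    return links_against_hip(result.stdout)
```

Note that `ldd` reports resolved link-time dependencies, so a backend that is loaded at run time via `dlopen` would not be caught by this check. Running a minimal binary, as suggested in the comment, would cover that case.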
I made some changes related to the llama.cpp PR. When someone has a chance, I'd appreciate a review. Thanks in advance!
- Solved the libomp.so linking issue by setting CMake RPATHs:
-DCMAKE_BUILD_RPATH=\"${THEROCK_BINARY_DIR}/external-builds/llama.cpp/llamacpp/dist/lib/llvm/lib\"
-DCMAKE_INSTALL_RPATH=\"${THEROCK_BINARY_DIR}/external-builds/llama.cpp/llamacpp/dist/lib/llvm/lib\"
- Removed llama.cpp as a Git submodule.
- Integrated `llamacpp_repo.py` and `repo_management.py` scripts to manage llama.cpp.
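For readers who drive CMake from Python, the two RPATH flags from this commit message could be assembled along these lines. This is a minimal sketch: the helper name `rpath_cmake_args` is hypothetical, and the directory layout is copied from the flags above rather than taken from the actual build script.

```python
from pathlib import Path


def rpath_cmake_args(therock_binary_dir: Path) -> list[str]:
    """Build the CMake RPATH flags pointing at the bundled LLVM lib directory.

    The path components mirror the commit message; treat them as an
    assumption if the dist layout changes.
    """
    llvm_lib = (
        therock_binary_dir
        / "external-builds"
        / "llama.cpp"
        / "llamacpp"
        / "dist"
        / "lib"
        / "llvm"
        / "lib"
    ).as_posix()  # CMake accepts forward slashes on every platform
    return [
        f"-DCMAKE_BUILD_RPATH={llvm_lib}",
        f"-DCMAKE_INSTALL_RPATH={llvm_lib}",
    ]
```

Passing the flags as separate list items to `subprocess.run` avoids the escaped quotes needed when the same flags are embedded in a shell or CMake string.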
…ists integration into TheRock.
- Removed CMakeLists.txt and artifact config for llama.cpp
- Added new Python-based build script: llamacpp_build.py
- Migrated and renamed repo management scripts:
  - llamacpp_repo.py
  - repo_management.py
- Restored root CMakeLists.txt to reflect new integration approach
…ma.cpp
- Modify llamacpp_build to support GPU, reorder llvm/amdgcn, and add an option to build for multiple GPUs and a clean argument
- Modify llamacpp_repo to clone upstream llama.cpp by default
- Add logic to support Windows installation.
- Switch ROCm discovery mechanism in the build script to use `rocm-sdk`.
- Update ROCm target detection to also rely on `rocm-sdk`.
- Enforce the use of Ninja as the build system on Windows.
- Introduce the `LLAMACPP_BUILD_DIR` environment variable for `llamacpp_test.py` to locate builds placed outside the default directory.
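Three of the behaviours mentioned in this thread (the `LLAMACPP_BUILD_DIR` override, Ninja on Windows, and skipping test-backend-ops) could look roughly like the sketch below. All function names and the `DEFAULT_BUILD_DIR` value are hypothetical, since the real default location and the script internals are not shown here.

```python
import os
import platform
from pathlib import Path

# Hypothetical default; the actual default build directory is not stated.
DEFAULT_BUILD_DIR = Path("build")

# Skipped per the earlier comment (floating-point exception on GFX1030).
SKIPPED_TESTS = {"test-backend-ops"}


def resolve_build_dir(env=None) -> Path:
    """Prefer LLAMACPP_BUILD_DIR when set, otherwise use the default."""
    env = os.environ if env is None else env
    override = env.get("LLAMACPP_BUILD_DIR")
    return Path(override) if override else DEFAULT_BUILD_DIR


def generator_args(system=None) -> list[str]:
    """Force the Ninja generator on Windows; leave CMake's default elsewhere."""
    system = platform.system() if system is None else system
    return ["-G", "Ninja"] if system == "Windows" else []


def select_tests(names) -> list[str]:
    """Drop the tests that are known to fail, keeping the original order."""
    return [name for name in names if name not in SKIPPED_TESTS]
```

Taking `env` and `system` as optional parameters keeps the helpers easy to exercise in CI without modifying the real environment.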
Motivation
Add llama.cpp as an external project to TheRock.
llama.cpp offers a lightweight, high-performance implementation for LLM inference, capable of running efficiently on both CPUs and ROCm-enabled GPUs. Including it in TheRock makes it easier for clients to experiment with and deploy language models, accelerating prototyping and evaluation.
Related to: #1449
Technical Details
llama.cpp is managed using the `repo_management.py`, `llamacpp_repo.py` and `llamacpp_build.py` scripts, in a manner similar to how PyTorch is handled as an external project. To execute it, simply run:

For a detailed explanation of its workflow, refer to `README.md`. The project is currently built using Python scripts by cloning the upstream repo.
Test Plan
Test Result
Submission Checklist