[Feature]: Add llama.cpp integration to TheRock by catan2001 · Pull Request #1652 · ROCm/TheRock

catan2001 · 2025-10-01T13:29:23Z

Motivation

Add llama.cpp as an external project to TheRock.

llama.cpp offers a lightweight, high-performance implementation for LLM inference, capable of running efficiently on both CPUs and ROCm-enabled GPUs. Including it in TheRock makes it easier for clients to experiment with and deploy language models, accelerating prototyping and evaluation.

Related to: #1449

Technical Details

llama.cpp is managed by the repo_management.py, llamacpp_repo.py, and llamacpp_build.py scripts, similar to how PyTorch is handled as an external project. To fetch and build it, run:

python llamacpp_repo.py
python llamacpp_build.py --rocm-dir <by default /opt/rocm> --gpu-targets <e.g. gfx1030,gfx1100>

For a detailed explanation of the workflow, refer to README.md. The project is currently built by Python scripts that clone the upstream repository.
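The two-step workflow above could also be driven from a single Python entry point. The following is only a minimal sketch: it assumes both scripts are in the current working directory and accept exactly the arguments shown in this description (`--rocm-dir`, `--gpu-targets`). The helper names `build_commands` and `run_all` are hypothetical and are not part of this PR.

```python
import subprocess
import sys
from typing import List, Sequence


def build_commands(
    rocm_dir: str = "/opt/rocm",
    gpu_targets: Sequence[str] = ("gfx1030", "gfx1100"),
) -> List[List[str]]:
    """Return the argv lists for the fetch step and the build step.

    Hypothetical helper: it only mirrors the two commands quoted in the
    PR description and does not inspect the scripts themselves.
    """
    return [
        # Step 1: clone/check out llama.cpp.
        [sys.executable, "llamacpp_repo.py"],
        # Step 2: build against the given ROCm install for the given GPUs.
        [
            sys.executable,
            "llamacpp_build.py",
            "--rocm-dir",
            rocm_dir,
            "--gpu-targets",
            ",".join(gpu_targets),
        ],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run_all(build_commands())` would run the fetch step and then the build step, raising `subprocess.CalledProcessError` if either script exits with a non-zero status.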

Test Plan

Built TheRock with llama.cpp included.
Ran a subset of llama.cpp test binaries to confirm successful compilation on Linux.
TODO: create smoke tests and run the full test suite.

Test Result

llama.cpp binaries compiled successfully on Linux.
Basic functionality verified via sample test runs.
Full validation and benchmarking still pending.

Submission Checklist

Read and followed the contributing guidelines: [TheRock Contributing Guide](https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests).
Add llama.cpp managed via Python scripts.
Add README.md
rocWMMA integration (waiting on "Add rocWMMA to TheRock", #1938).
Completed extended testing on Linux and Windows.

catan2001 · 2025-10-02T12:38:44Z

I changed the PR to Draft because llama.cpp is currently integrated as a Git submodule rather than being managed via Python scripts.
I will re-open it for review once the script-based management is in place.

catan2001 · 2025-10-23T08:51:32Z

I've updated the test scripts and verified that they work correctly on Linux. As part of this, I created a Python script that automates running the integrated llama.cpp tests, which could be useful for CI/CD workflows. The script currently skips test-backend-ops because of a floating-point exception on GFX1030 (tested using GFX1031 via HSA_OVERRIDE_GFX_VERSION), but everything runs without issues on GFX1201 and GFX1100. It also skips every test that requires downloading additional files, since those downloads would slow down the CI/CD pipeline.
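The skip logic described above can be sketched roughly as follows. This is not the code from `llamacpp_test.py`. It assumes test binaries are identified by a `test-` name prefix and that the set of download-dependent tests is supplied by the caller. `SKIP_BY_NAME`, `select_tests`, and the sample names in the usage note are hypothetical.

```python
from typing import Iterable, List, Set

# Assumption: only test-backend-ops is skipped by name, as stated in the
# comment above (floating-point exception on GFX1030).
SKIP_BY_NAME: Set[str] = {"test-backend-ops"}


def select_tests(
    candidates: Iterable[str],
    needs_download: Set[str],
    skip: Set[str] = SKIP_BY_NAME,
) -> List[str]:
    """Return the sorted test names that should run in CI.

    A candidate is kept only if it looks like a test binary (hypothetical
    'test-' prefix rule), is not explicitly skipped, and does not need to
    download extra files.
    """
    return sorted(
        name
        for name in candidates
        if name.startswith("test-")
        and name not in skip
        and name not in needs_download
    )
```

For example, `select_tests(["test-backend-ops", "test-a", "test-b", "llama-cli"], {"test-b"})` keeps only `test-a`, where `test-a` and `test-b` are placeholder names rather than real llama.cpp tests.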

I'm still unsure whether we need additional sanity checks. My initial thought was to verify that the built binaries are linked against HIP/ROCm, but that might be overkill; running a minimal binary to confirm functionality may be sufficient. I've also switched to the upstream llama.cpp repo instead of ROCm's fork, which was based on a much older commit. It would be helpful if someone could review what I've done so far.
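If the linkage check mentioned above turns out to be worthwhile, one possible shape for it on Linux is to inspect `ldd` output for the HIP runtime library. The sketch below is not part of this PR. It assumes the HIP runtime appears as a shared library whose name contains `libamdhip64`, and the helper names are hypothetical.

```python
import subprocess
from typing import List

# Assumption: the HIP runtime shows up in ldd output under this name stem.
HIP_RUNTIME_MARKER = "libamdhip64"


def linked_libraries(ldd_output: str) -> List[str]:
    """Extract the library name (first token) from each ldd output line."""
    libs = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if line:
            libs.append(line.split()[0])
    return libs


def links_against_hip(ldd_output: str) -> bool:
    """Return True if any listed library looks like the HIP runtime."""
    return any(HIP_RUNTIME_MARKER in lib for lib in linked_libraries(ldd_output))


def ldd(binary: str) -> str:
    """Run ldd on a binary and return its stdout (Linux only)."""
    result = subprocess.run(
        ["ldd", binary], check=True, capture_output=True, text=True
    )
    return result.stdout
```

A CI step could then assert `links_against_hip(ldd(path_to_binary))` for one or two built binaries. Running a minimal binary, as suggested above, would remain a complementary functional check.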

catan2001 · 2025-10-24T07:19:14Z

I made some changes to the llama.cpp PR. When someone has a chance, I'd appreciate a review.

Thanks in advance!

- Solved the libomp.so linking issue by setting CMake RPATHs:
  - `-DCMAKE_BUILD_RPATH=\"${THEROCK_BINARY_DIR}/external-builds/llama.cpp/llamacpp/dist/lib/llvm/lib\"`
  - `-DCMAKE_INSTALL_RPATH=\"${THEROCK_BINARY_DIR}/external-builds/llama.cpp/llamacpp/dist/lib/llvm/lib\"`
- Removed llama.cpp as a Git submodule.
- Integrated `llamacpp_repo.py` and `repo_management.py` scripts to manage llama.cpp.
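To illustrate the RPATH fix in this commit message, a Python build script could assemble the two CMake flags along these lines. This is a sketch under stated assumptions, not the PR's code: the helper name `rpath_flags` is hypothetical, and the directory layout is copied from the commit message. `CMAKE_BUILD_RPATH` and `CMAKE_INSTALL_RPATH` are standard CMake variables.

```python
from pathlib import PurePosixPath
from typing import List

# Relative location of the LLVM runtime libraries (including libomp.so),
# copied from the commit message above.
LLVM_LIB_SUBDIR = "external-builds/llama.cpp/llamacpp/dist/lib/llvm/lib"


def rpath_flags(therock_binary_dir: str) -> List[str]:
    """Return CMake -D flags that point both RPATHs at the LLVM lib dir.

    Setting the build RPATH lets binaries in the build tree find libomp.so;
    setting the install RPATH does the same for installed binaries.
    """
    llvm_lib = PurePosixPath(therock_binary_dir) / LLVM_LIB_SUBDIR
    return [
        f"-DCMAKE_BUILD_RPATH={llvm_lib}",
        f"-DCMAKE_INSTALL_RPATH={llvm_lib}",
    ]
```

Passing the flags as separate argv entries to `subprocess.run` avoids the shell-level quote escaping seen in the commit message.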

…ists integration into TheRock.
- Removed CMakeLists.txt and artifact config for llama.cpp
- Added new Python-based build script: llamacpp_build.py
- Migrated and renamed repo management scripts:
  - llamacpp_repo.py
  - repo_management.py
- Restored root CMakeLists.txt to reflect new integration approach

…ma.cpp
- Modify llamacpp_build to support GPU, reorder llvm/amdgcn, and add an option to build for multiple GPUs and a clean argument
- Modify llamacpp_repo to clone upstream llama.cpp by default

- Add logic to support Windows installation.
- Switch ROCm discovery mechanism in the build script to use `rocm-sdk`.
- Update ROCm target detection to also rely on `rocm-sdk`.
- Enforce the use of Ninja as the build system on Windows.
- Introduce the `LLAMACPP_BUILD_DIR` environment variable for `llamacpp_test.py` to locate builds placed outside the default directory.
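Two of the changes listed in this commit message, the `LLAMACPP_BUILD_DIR` override and the Ninja requirement on Windows, could look roughly like the sketch below. It is not taken from the PR. `DEFAULT_BUILD_DIR` is a placeholder, since the actual default directory is not stated here, and both helper names are hypothetical.

```python
import os
from pathlib import Path
from typing import List, Mapping

# Placeholder only: the real default build directory is not given in the PR text.
DEFAULT_BUILD_DIR = Path("llama.cpp") / "build"


def resolve_build_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Prefer LLAMACPP_BUILD_DIR when set, otherwise use the default."""
    override = env.get("LLAMACPP_BUILD_DIR")
    return Path(override) if override else DEFAULT_BUILD_DIR


def generator_args(platform_name: str) -> List[str]:
    """Return CMake generator arguments; force Ninja on Windows.

    platform_name is expected to be a sys.platform value such as
    'win32' or 'linux'.
    """
    return ["-G", "Ninja"] if platform_name.startswith("win") else []
```

A test runner could call `resolve_build_dir()` once at startup, and a build script could append `generator_args(sys.platform)` to its CMake configure command.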

github-project-automation Bot added this to TheRock Triage Oct 1, 2025

github-project-automation Bot moved this to TODO in TheRock Triage Oct 1, 2025

catan2001 changed the title ~~[FEATURE]: Add llama.cpp integration to TheRock~~ [Feature]: Add llama.cpp integration to TheRock Oct 2, 2025

catan2001 marked this pull request as draft October 2, 2025 12:27

catan2001 force-pushed the llamacpp-integration branch 2 times, most recently from f6f2acf to bf5ecf4 October 13, 2025 12:59

catan2001 force-pushed the llamacpp-integration branch from 0cd9505 to 54c91cb October 23, 2025 07:52

catan2001 marked this pull request as ready for review October 23, 2025 13:20

catan2001 mentioned this pull request Oct 29, 2025

[Issue]: Running hipconfig --rocmpath from Python packages uses 'core' instead of 'devel' and does not include HIP runtime CMake files #1880

Closed

catan2001 force-pushed the llamacpp-integration branch from bcd412d to a385e80 November 7, 2025 15:56

catan2001 added 7 commits November 13, 2025 08:41

[FEATURE]: Add llama.cpp integration to TheRock

d2a5267

[Feature]: Add README

b733aa0

[Feature/Fix]: Add llamacpp_test.py for automating tests built by lla…

1768714

[Fix]: Update README, enable running ops tests, make scripts executable

a49e572

catan2001 force-pushed the llamacpp-integration branch from a385e80 to 060dca8 November 13, 2025 08:48

catan2001 mentioned this pull request Nov 17, 2025

[Issue]: rocWMMA is not integrated into python packages #2168

Closed

catan2001 wants to merge 7 commits into ROCm:main from catan2001:llamacpp-integration
