This project serves as an example of how to integrate the LLamaSharp library into your Unity project.
To run this project you need to download a GGUF model into the Assets/StreamingAssets folder,
e.g. LLAMA-7B GGUF.
Note: you only need a single .gguf file. The files there usually differ only by quantization level; see the model's readme for details.
This project contains a single scene at Assets/Scenes/SampleScene.unity with a simple chat UI.
The MonoBehaviour that contains all the logic is named LLamaSharpTestScript; it is already added and set up on the Example GameObject.
It broadly follows the LLamaSharp readme example and shows how to switch between different chat sessions.
Before running the project, point LLamaSharpTestScript.ModelPath (in the inspector) to your model's path inside StreamingAssets.
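For orientation, here is a minimal sketch of what such a script can look like. The class name, prompt text, and parameter values are made up for illustration; the actual, more complete logic lives in LLamaSharpTestScript, and exact LLamaSharp signatures may differ between library versions.

```csharp
using System.Collections.Generic;
using System.IO;
using LLama;
using LLama.Common;
using UnityEngine;

// Illustrative sketch only -- the project's real logic is in LLamaSharpTestScript,
// which also handles switching between sessions and offloading work via UniTask.
// Type and method names follow the LLamaSharp readme; exact signatures can differ
// between LLamaSharp versions.
public class MinimalChatExample : MonoBehaviour
{
    // File name of the model inside Assets/StreamingAssets, set from the inspector.
    public string ModelPath = "model.gguf";

    private async void Start()
    {
        var fullPath = Path.Combine(Application.streamingAssetsPath, ModelPath);

        var parameters = new ModelParams(fullPath) { ContextSize = 1024 };
        using var weights = LLamaWeights.LoadFromFile(parameters);
        using var context = weights.CreateContext(parameters);
        var session = new ChatSession(new InteractiveExecutor(context));

        // Stream the reply token by token. In a real project this blocking work
        // should be moved off the main thread (see the UniTask package below).
        var inferenceParams = new InferenceParams { AntiPrompts = new List<string> { "User:" } };
        await foreach (var token in session.ChatAsync("User: Hello, who are you?\nAssistant:", inferenceParams))
        {
            Debug.Log(token);
        }
    }
}
```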
Additionally, this project uses the following packages:
- UniTask: for integrating async tasks with Unity and offloading blocking work to the thread pool.
- NuGetForUnity: for fetching LLamaSharp and all its dependencies from NuGet.
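For reference, a small sketch of the offloading pattern UniTask enables (the class and method names here are hypothetical, not part of the project):

```csharp
using Cysharp.Threading.Tasks;
using UnityEngine;
using UnityEngine.UI;

// Hypothetical example of the pattern used for blocking LLamaSharp calls:
// run the heavy work on the thread pool, then return to the main thread
// before touching any Unity objects.
public class OffloadExample : MonoBehaviour
{
    public Text Output; // any UI element that should show the result

    public async UniTask GenerateAsync()
    {
        await UniTask.SwitchToThreadPool();
        var reply = RunBlockingInference(); // stand-in for a blocking LLamaSharp call

        await UniTask.SwitchToMainThread();
        Output.text = reply; // Unity UI may only be modified on the main thread
    }

    private string RunBlockingInference() => "...";
}
```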
To set up LLamaSharp in your own Unity project:
- Install UniTask and NuGetForUnity via the Package Manager.
- Install LLamaSharp via NuGetForUnity.
- Manually download one of the LLamaSharp.Backend.xx NuGet packages (the LLamaSharp and Backend versions must match exactly!).
- Unpack it as a ZIP and move `runtimes/<your runtime>/libllama.dll` into your Unity project's Assets. (Note: the dll must be called `libllama.dll` to be found. If it's named `llama.dll`, rename it when adding it to the Unity project. The editor check sketched after this list can help verify this.)
- Move the model to the StreamingAssets folder.
- Move `LLamaSharpBuildPostprocessor` into your project, or write your own for targets other than Windows (see Build and distribution).
- Download the CUDA Runtime dlls and add them to your project so the build can run on systems without CUDA installed. For the Windows-x64 target you can download them from the llama.cpp releases here; you need the file named `cudart-llama-bin-win-cu#.#.#-x64.zip`, where #.#.# is your LLamaSharp backend's CUDA version.
- Make sure that your LLamaSharp library and backend are the same version. The same goes for the CUDA version of LLamaSharp.Backend.CUDA and your installed CUDA (or CUDA RT).
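As a quick sanity check of the file layout after these steps, something like the following hypothetical editor helper (not part of the project) can be used:

```csharp
#if UNITY_EDITOR
using System.IO;
using UnityEditor;
using UnityEngine;

// Hypothetical editor helper that checks the setup steps above:
// libllama.dll somewhere under Assets and a .gguf model in StreamingAssets.
public static class LLamaSetupCheck
{
    [MenuItem("Tools/Check LLamaSharp Setup")]
    public static void Check()
    {
        var dlls = Directory.GetFiles(Application.dataPath, "libllama.dll", SearchOption.AllDirectories);
        Debug.Log(dlls.Length > 0
            ? $"Found libllama.dll at: {dlls[0]}"
            : "libllama.dll not found under Assets -- remember it must be named libllama.dll.");

        var streamingAssets = Application.streamingAssetsPath;
        var models = Directory.Exists(streamingAssets)
            ? Directory.GetFiles(streamingAssets, "*.gguf", SearchOption.AllDirectories)
            : new string[0];
        Debug.Log(models.Length > 0
            ? $"Found GGUF model: {models[0]}"
            : "No .gguf model found in Assets/StreamingAssets.");
    }
}
#endif
```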
At this point you should be able to copy the example from this project and run it in your own.
There is an issue with LLamaSharp locating libllama.dll in the build's Plugins directory.
As a quick workaround, LLamaSharpBuildPostprocessor copies libllama.dll into the same directory as the .exe.
It only supports the Windows build target, but you can adapt it to other targets, or resolve this manually after each build.
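For reference, a rough sketch of what such a post-build step can look like for a Windows standalone build. The `Plugins/x86_64` subpath below is an assumption and may differ per Unity version and architecture; the project's own `LLamaSharpBuildPostprocessor` is the version actually used.

```csharp
#if UNITY_EDITOR
using System.IO;
using UnityEditor;
using UnityEditor.Callbacks;

// Sketch of a post-build step that copies libllama.dll next to the built .exe.
// The Plugins subpath is an assumption -- adjust it to match your build layout.
public static class CopyLibLlamaPostprocessor
{
    [PostProcessBuild]
    public static void OnPostprocessBuild(BuildTarget target, string pathToBuiltProject)
    {
        if (target != BuildTarget.StandaloneWindows64)
            return;

        var buildDir = Path.GetDirectoryName(pathToBuiltProject);
        var productName = Path.GetFileNameWithoutExtension(pathToBuiltProject);
        var pluginDll = Path.Combine(buildDir, $"{productName}_Data", "Plugins", "x86_64", "libllama.dll");

        if (File.Exists(pluginDll))
            File.Copy(pluginDll, Path.Combine(buildDir, "libllama.dll"), overwrite: true);
    }
}
#endif
```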
If you have a project using LLamaSharp together with Unity and want it to appear here, please create an issue and I will add it!