To specify which GPUs to use, set the environment variable `CUDA_VISIBLE_DEVICES`.
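Since `CUDA_VISIBLE_DEVICES` is an ordinary environment variable, it can be scoped to a single worker process rather than the whole shell; a minimal Python sketch (the GPU indices are arbitrary examples):

```python
import os
import subprocess
import sys

# Restrict a child process to GPUs 0 and 2 (example indices): the CUDA
# runtime in the child will only enumerate the devices listed here.
env = dict(os.environ, CUDA_VISIBLE_DEVICES="0,2")
out = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
    env=env,
    capture_output=True,
    text=True,
)
print(out.stdout.strip())  # → 0,2
```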
## SGLang

```
# Install libnuma
apt install -y libnuma-dev

uv pip install ai-dynamo[sglang]
```

Run the backend/worker like this:
```
# Note the '.worker' in the module path for SGLang
python -m dynamo.sglang.worker --help
```

You can pass any sglang flags directly to this worker; see https://docs.sglang.ai/backend/server_arguments.html, which also covers using multiple GPUs.
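The pass-through works because everything after the module name lands in the worker's `sys.argv`, which sglang's own argument parser then consumes; a small illustration (the inline `-c` script stands in for the worker module, and the flag is just a representative example):

```python
import subprocess
import sys

# Everything after the "module" arrives verbatim in the child process's
# sys.argv, exactly as if it had been typed on the worker's command line.
out = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1:])", "--tp-size", "2"],
    capture_output=True,
    text=True,
)
print(out.stdout.strip())  # → ['--tp-size', '2']
```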
sudo apt-get -y install libopenmpi-dev

### After installing the pre-requisites above, install Dynamo
```
uv pip install ai-dynamo[trtllm]
```

Run the backend/worker like this:
maturin develop --uv
```

```
cd $PROJECT_ROOT
uv pip install .
# For development, use
export PYTHONPATH="${PYTHONPATH}:$(pwd)/components/frontend/src:$(pwd)/components/planner/src:$(pwd)/components/backends/vllm/src:$(pwd)/components/backends/sglang/src:$(pwd)/components/backends/trtllm/src:$(pwd)/components/backends/llama_cpp/src:$(pwd)/components/backends/mocker/src"
```

> [!NOTE]
> Editable (`-e`) does not work because the `dynamo` package is split over multiple directories, one per backend.

You should now be able to run `python -m dynamo.frontend`.
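The editable-install restriction exists because `dynamo` is a namespace package assembled from several per-backend `src` directories; the sketch below (directory and module names are hypothetical) shows how Python merges such a split package at import time, which is what the `PYTHONPATH` export above relies on:

```python
import os
import sys
import tempfile

# Build two independent directories that each contribute one module to the
# same top-level "demo_ns" package. No __init__.py anywhere, so Python
# treats demo_ns as an implicit namespace package (PEP 420).
root = tempfile.mkdtemp()
for backend, module in [("backend_a", "worker_a"), ("backend_b", "worker_b")]:
    pkg_dir = os.path.join(root, backend, "demo_ns")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write("NAME = %r\n" % module)
    sys.path.insert(0, os.path.join(root, backend))

# Both halves are importable under the single demo_ns namespace.
from demo_ns import worker_a, worker_b
print(worker_a.NAME, worker_b.NAME)  # → worker_a worker_b
```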
# llama.cpp engine for Dynamo

Usage:
```
# Install ai-dynamo llama.cpp backend (CPU Mode)
pip install "ai-dynamo[llama_cpp]"

# [Optional] To build llama.cpp for CUDA (needs a recent pip)
pip install --force-reinstall -r requirements.gpu.txt

python -m dynamo.llama_cpp --model-path /data/models/Qwen3-0.6B-Q8_0.gguf [args]
```
## Request Migration