Commit 4ad281f

refactor: Move TRTLLM example to the component/backends (#1976)
1 parent 57d24a1 commit 4ad281f

36 files changed, +12 -12 lines changed

examples/tensorrt_llm/README.md renamed to components/backends/trtllm/README.md

Lines changed: 5 additions & 5 deletions
@@ -123,13 +123,13 @@ This figure shows an overview of the major components to deploy:
 
 #### Aggregated
 ```bash
-cd $DYNAMO_ROOT/examples/tensorrt_llm
+cd $DYNAMO_HOME/components/backends/trtllm
 ./launch/agg.sh
 ```
 
 #### Aggregated with KV Routing
 ```bash
-cd $DYNAMO_ROOT/examples/tensorrt_llm
+cd $DYNAMO_HOME/components/backends/trtllm
 ./launch/agg_router.sh
 ```
 
@@ -139,7 +139,7 @@ cd $DYNAMO_ROOT/examples/tensorrt_llm
 > Disaggregated serving supports two strategies for request flow: `"prefill_first"` and `"decode_first"`. By default, the script below uses the `"decode_first"` strategy, which can reduce response latency by minimizing extra hops in the return path. You can switch strategies by setting the `DISAGGREGATION_STRATEGY` environment variable.
 
 ```bash
-cd $DYNAMO_ROOT/examples/tensorrt_llm
+cd $DYNAMO_HOME/components/backends/trtllm
 ./launch/disagg.sh
 ```
 
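Reviewer note on the hunk above: the README context says the strategy is chosen through `DISAGGREGATION_STRATEGY`, with `"decode_first"` as the default for `disagg.sh`. This diff does not show how the launch script reads that variable, so the following is only a minimal sketch of the "environment variable with a default" pattern. `resolve_strategy` is a hypothetical helper, not something taken from the Dynamo scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper (assumption: NOT part of the Dynamo launch scripts).
# Prints the strategy to use: $DISAGGREGATION_STRATEGY if it is set and
# non-empty, otherwise the default given as $1 (or "decode_first").
resolve_strategy() {
  local default strategy
  default="${1:-decode_first}"
  strategy="${DISAGGREGATION_STRATEGY:-$default}"

  # Accept only the two strategy names mentioned in the README text.
  case "$strategy" in
    prefill_first|decode_first)
      printf '%s\n' "$strategy"
      ;;
    *)
      echo "unsupported DISAGGREGATION_STRATEGY: $strategy" >&2
      return 1
      ;;
  esac
}
```

Going by the README text, switching strategies for a single run would presumably look like `DISAGGREGATION_STRATEGY=prefill_first ./launch/disagg.sh`. Passing the default as an argument lets the same sketch cover the KV-routing variant, which the README describes as defaulting to prefill first.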
@@ -149,13 +149,13 @@ cd $DYNAMO_ROOT/examples/tensorrt_llm
 > Disaggregated serving with KV routing uses a "prefill first" workflow by default. Currently, Dynamo supports KV routing to only one endpoint per model. In disaggregated workflow, it is generally more effective to route requests to the prefill worker. If you wish to use a "decode first" workflow instead, you can simply set the `DISAGGREGATION_STRATEGY` environment variable accordingly.
 
 ```bash
-cd $DYNAMO_ROOT/examples/tensorrt_llm
+cd $DYNAMO_HOME/components/backends/trtllm
 ./launch/disagg_router.sh
 ```
 
 #### Aggregated with Multi-Token Prediction (MTP) and DeepSeek R1
 ```bash
-cd $DYNAMO_ROOT/examples/tensorrt_llm
+cd $DYNAMO_HOME/components/backends/trtllm
 
 export AGG_ENGINE_ARGS=./engine_configs/deepseek_r1/mtp/mtp_agg.yaml
 export SERVED_MODEL_NAME="nvidia/DeepSeek-R1-FP4"
