
[CI][ROCm] Ship RIXL with vllm/vllm-openai-rocm#41634

Merged
tjtanaa merged 5 commits into vllm-project:main from simondanielsson:ci/rixl-final-image
May 8, 2026

Conversation

@simondanielsson
Contributor

@simondanielsson simondanielsson commented May 4, 2026

Purpose

Fixes #41637.

RIXL is not readily available in the official vLLM ROCm image:

$ docker run --rm --entrypoint python3  vllm/vllm-openai-rocm:v0.20.1 -c "from rixl._api import nixl_agent; print('RIXL OK')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'rixl'

Unlike NIXL, there are no pre-built wheels for RIXL yet. As a result, the NixlConnector for P/D disaggregation cannot be used on AMD platforms without a manual install from source, which limits reproducibility and productivity. My suggestion is to follow the current vLLM documentation and ship RIXL with the ROCm image, at least until RIXL wheels are readily available.

RIXL is already installed in the test stage of the image, but not in the final stage. This PR installs it in the final stage as well.
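
A minimal sketch of what the final-stage install can look like, assuming the RIXL wheel is built in an earlier stage of the multi-stage Dockerfile (the stage name build_rixl and the wheel path are illustrative, not the exact merged diff):

# Final image stage (stage and path names are hypothetical)
ARG BASE_IMAGE=rocm/vllm-dev:base
FROM ${BASE_IMAGE} AS vllm-openai-rocm
# Copy the RIXL wheel produced in the earlier build stage and install it,
# so `from rixl._api import nixl_agent` works out of the box.
COPY --from=build_rixl /workspace/rixl/dist/*.whl /tmp/rixl-wheels/
RUN python3 -m pip install --no-cache-dir /tmp/rixl-wheels/*.whl \
    && rm -rf /tmp/rixl-wheels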

Note: this aligns the behavior with the NVIDIA stack:

$ docker run --rm --entrypoint python3  vllm/vllm-openai:v0.20.1 -c "from nixl._api import nixl_agent; print('NIXL OK')"
NIXL OK

Test Plan

Tested on an 8xMI300X node with Thor2 NICs.

1. Build


docker build \
  -f docker/Dockerfile.rocm \
  --build-arg BASE_IMAGE=rocm/vllm-dev:base \
  -t vllm/vllm-openai-rocm:local \
  .

Depending on your platform you might need additional RDMA user-space libraries. The following is for Thor2:

# docker/Dockerfile.rocm_dev
ARG BASE_IMAGE=vllm/vllm-openai-rocm:local
FROM ${BASE_IMAGE}

RUN apt-get update -q -y && apt-get install -q -y \
        autoconf \
        libtool \
        unzip \
        wget \
    && rm -rf /var/lib/apt/lists/*

# Thor2 (Broadcom BCM5760x) RDMA user-space driver (libbnxt_re).
# The inbox libbnxt_re-rdmav*.so shipped by libibverbs is renamed so the
# vendor build takes precedence via libibverbs provider discovery.
RUN wget -q \
        https://docs.broadcom.com/docs-and-downloads/ethernet-network-adapters/NXE/Thor2/GCA1/bcm5760x_230.2.52.0a.zip \
    && unzip -q bcm5760x_230.2.52.0a.zip \
    && cd bcm5760x_230.2.52.0a/drivers_linux/bnxt_rocelib/ \
    && tar -xf "$(find . -name 'libbnxt*.tar.gz' | head -n 1)" \
    && cd "$(find . -maxdepth 1 -type d -name 'libbnxt*' ! -name '*.tar.gz' | head -n 1)" \
    && sh autogen.sh \
    && ./configure \
    && make \
    && find /usr/lib64/ /usr/lib -name "libbnxt_re-rdmav*.so" \
         -exec mv {} {}.inbox \; 2>/dev/null || true \
    && make install all \
    && echo /usr/local/lib >> /etc/ld.so.conf \
    && ldconfig \
    && cp -f bnxt_re.driver /etc/libibverbs.d/ \
    && cd / \
    && rm -rf /bcm5760x_230.2.52.0a /bcm5760x_230.2.52.0a.zip

which we can build with

docker build \
    -f docker/Dockerfile.rocm_dev \
    -t vllm/vllm-openai-rocm:local-rixl \
    .
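
Optionally, you can sanity-check that the vendor RDMA provider is discovered inside the container, e.g. with ibv_devinfo (a hypothetical check; it assumes the ibverbs-utils tooling is available in the image):

docker run --rm \
  --device /dev/infiniband \
  --entrypoint ibv_devinfo \
  vllm/vllm-openai-rocm:local-rixl

Seeing the bnxt_re devices listed confirms that libibverbs picked up the vendor driver.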

2. Test

  1. RIXL importable:
docker run --rm --entrypoint python3  vllm/vllm-openai-rocm:local -c "from rixl._api import nixl_agent; print('RIXL OK')"
  2. vLLM starts with NixlConnector without errors:
docker run \
  --rm \
  --name rixl-prefill \
  --init --network host --ipc host --privileged \
  --cap-add SYS_PTRACE --security-opt seccomp=unconfined \
  --ulimit memlock=-1 --ulimit stack=67108864 \
  --shm-size 256G \
  --group-add video --group-add render \
  --device /dev/kfd --device /dev/dri --device /dev/infiniband \
  -v /sys:/sys \
  -v "${HOME}/.cache/huggingface:/root/.cache/huggingface" \
  -e HF_HOME=/root/.cache/huggingface \
  -e HF_HUB_ENABLE_HF_TRANSFER=0 \
  -e NCCL_MIN_NCHANNELS=112 \
  -e VLLM_ENGINE_READY_TIMEOUT_S=3600 \
  -e VLLM_ROCM_USE_AITER=1 \
  -e VLLM_ROCM_USE_AITER_PAGED_ATTN=0 \
  -e VLLM_ROCM_USE_AITER_RMSNORM=1 \
  -e NCCL_SOCKET_IFNAME=ens51np0 \
  vllm/vllm-openai-rocm:local-rixl \
  deepseek-ai/DeepSeek-R1-0528 \
    --port 8100 \
    --load-format dummy \
    --tensor-parallel-size 8 \
    --kv-cache-dtype fp8 \
    --gpu-memory-utilization 0.7 \
    --max-model-len 16384 \
    --trust-remote-code \
    --block-size 1 \
    --enforce-eager \
    --kv-transfer-config '{
      "kv_connector": "NixlConnector",
      "kv_role": "kv_producer"
    }'
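
For a full 1P-1D test, one would additionally launch a matching decode instance with the consumer role. A minimal sketch mirroring the producer flags above (the container name, port, and trimmed-down set of flags are illustrative; on a single node you would lower --gpu-memory-utilization or partition the GPUs so both instances fit):

docker run \
  --rm \
  --name rixl-decode \
  --init --network host --ipc host --privileged \
  --group-add video --group-add render \
  --device /dev/kfd --device /dev/dri --device /dev/infiniband \
  -v "${HOME}/.cache/huggingface:/root/.cache/huggingface" \
  vllm/vllm-openai-rocm:local-rixl \
  deepseek-ai/DeepSeek-R1-0528 \
    --port 8200 \
    --load-format dummy \
    --tensor-parallel-size 8 \
    --kv-cache-dtype fp8 \
    --gpu-memory-utilization 0.7 \
    --max-model-len 16384 \
    --trust-remote-code \
    --block-size 1 \
    --enforce-eager \
    --kv-transfer-config '{
      "kv_connector": "NixlConnector",
      "kv_role": "kv_consumer"
    }'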

Test Result

  1. Importable:
$ docker run --rm --entrypoint python3  vllm/vllm-openai-rocm:local -c "from rixl._api import nixl_agent; print('RIXL OK')"
RIXL OK
  2. vLLM runs with the NixlConnector:
...
(EngineCore pid=621) INFO 05-04 15:43:37 [core.py:306] init engine (profile, create kv cache, warmup model) took 23.54 s
(EngineCore pid=621) INFO 05-04 15:43:40 [factory.py:64] Creating v1 connector with name: NixlConnector and engine_id: 4babd70a-813e-475e-b651-85c47f14298a
(EngineCore pid=621) WARNING 05-04 15:43:40 [base.py:189] Initializing KVConnectorBase_V1. This API is experimental and subject to change in the future as we iterate the design.
(EngineCore pid=621) INFO 05-04 15:43:40 [scheduler.py:87] Initializing NIXL Scheduler 4babd70a-813e-475e-b651-85c47f14298a
(EngineCore pid=621) INFO 05-04 15:43:40 [scheduler.py:89] Hybrid Memory Allocator is enabled with NIXL
(EngineCore pid=621) INFO 05-04 15:43:42 [vllm.py:844] Asynchronous scheduling is enabled.
(EngineCore pid=621) WARNING 05-04 15:43:42 [vllm.py:900] Enforce eager set, disabling torch.compile and CUDAGraphs. This is equivalent to setting -cc.mode=none -cc.cudagraph_mode=none
(EngineCore pid=621) WARNING 05-04 15:43:42 [vllm.py:918] Inductor compilation was disabled by user settings, optimizations settings that are only active during inductor compilation will be ignored.
(EngineCore pid=621) INFO 05-04 15:43:42 [kernel.py:210] Final IR op priority after setting platform defaults: IrOpPriorityConfig(rms_norm=['vllm_c', 'native'], fused_add_rms_norm=['vllm_c', 'native'])
(EngineCore pid=621) INFO 05-04 15:43:42 [vllm.py:1093] Cudagraph is disabled under eager mode
(EngineCore pid=621) INFO 05-04 15:43:42 [compilation.py:303] Enabled custom fusions: norm_quant, act_quant, allreduce_rms
(APIServer pid=7) INFO 05-04 15:43:42 [api_server.py:598] Supported tasks: ['generate']
(APIServer pid=7) WARNING 05-04 15:43:42 [model.py:1449] Default vLLM sampling parameters have been overridden by the model's `generation_config.json`: `{'temperature': 0.6, 'top_p': 0.95}`. If this is not intended, please relaunch vLLM instance with `--generation-config vllm`.
(APIServer pid=7) INFO 05-04 15:43:45 [hf.py:482] Detected the chat template content format to be 'string'. You can set `--chat-template-content-format` to override this.
(APIServer pid=7) INFO 05-04 15:43:45 [api_server.py:602] Starting vLLM server on http://0.0.0.0:8100
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:37] Available routes are:
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /openapi.json, Methods: GET, HEAD
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /docs, Methods: GET, HEAD
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /docs/oauth2-redirect, Methods: GET, HEAD
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /redoc, Methods: GET, HEAD
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /tokenize, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /detokenize, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /load, Methods: GET
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /version, Methods: GET
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /health, Methods: GET
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /metrics, Methods: GET
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/models, Methods: GET
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /ping, Methods: GET
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /ping, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /invocations, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/chat/completions, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/chat/completions/batch, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/responses, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/responses/{response_id}, Methods: GET
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/responses/{response_id}/cancel, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/completions, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/messages, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/messages/count_tokens, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /inference/v1/generate, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /scale_elastic_ep, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /is_scaling_elastic_ep, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /generative_scoring, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/chat/completions/render, Methods: POST
(APIServer pid=7) INFO 05-04 15:43:45 [launcher.py:46] Route: /v1/completions/render, Methods: POST
(APIServer pid=7) INFO:     Started server process [7]
(APIServer pid=7) INFO:     Waiting for application startup.
(APIServer pid=7) INFO:     Application startup complete.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.


Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
@simondanielsson simondanielsson changed the title from "[CI][ROCm] install RIXL wheel in final stage of Dockerfile.rocm" to "[CI][ROCm] Install RIXL wheel in final stage of Dockerfile.rocm" May 4, 2026
@mergify mergify Bot added ci/build rocm Related to AMD ROCm labels May 4, 2026
@github-project-automation github-project-automation Bot moved this to Todo in AMD May 4, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the ROCm Dockerfile to install the RIXL wheel and its RDMA runtime dependencies, and sets the HSA_ENABLE_IPC_MODE_LEGACY environment variable to avoid GPU memory pinning issues. Feedback was provided to optimize the system package installation by adding the --no-install-recommends flag, removing an invalid flag from the update command, and reordering the instructions to improve Docker layer caching.
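
For illustration, the suggested apt-get pattern looks roughly like this (the package names stand in for the RDMA runtime dependencies and are not the merged diff):

RUN apt-get update -q \
    && apt-get install -q -y --no-install-recommends \
        libibverbs1 \
        librdmacm1 \
    && rm -rf /var/lib/apt/lists/*

--no-install-recommends keeps the image smaller by skipping recommended-but-unneeded packages, and the -y flag that the review flagged is dropped from apt-get update, where it has no effect.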

Comment thread docker/Dockerfile.rocm Outdated
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
@simondanielsson simondanielsson marked this pull request as ready for review May 4, 2026 14:45

@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@simondanielsson simondanielsson changed the title from "[CI][ROCm] Install RIXL wheel in final stage of Dockerfile.rocm" to "[CI][ROCm] Ship RIXL with vllm/vllm-openai-rocm" May 4, 2026
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
@mergify
Contributor

mergify Bot commented May 4, 2026

Documentation preview: https://vllm--41634.org.readthedocs.build/en/41634/

@mergify mergify Bot added the documentation Improvements or additions to documentation label May 4, 2026
Contributor

@divakar-amd divakar-amd left a comment


Added 2 comments. Looks good overall.

Comment thread docker/Dockerfile.rocm Outdated
Comment thread docker/Dockerfile.rocm Outdated
… earlier

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Contributor

@divakar-amd divakar-amd left a comment


LGTM. Tested the change with a single-node 1P-1D disaggregated setup on mi300.

@functionstackx

@simondanielsson thanks for this PR for having RIXL available out of the box (for models where RIXL is better than mori)

@tjtanaa tjtanaa added the ready ONLY add when PR is ready to merge/full CI is needed label May 8, 2026
Collaborator

@tjtanaa tjtanaa left a comment


LGTM

@tjtanaa tjtanaa enabled auto-merge (squash) May 8, 2026 06:38
@tjtanaa tjtanaa merged commit f9b9bf3 into vllm-project:main May 8, 2026
13 of 14 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in AMD May 8, 2026
@simondanielsson simondanielsson deleted the ci/rixl-final-image branch May 8, 2026 07:19
libinta pushed a commit to libinta/vllm that referenced this pull request May 8, 2026
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: Libin Tang <libin.tang@intel.com>

Labels

ci/build
documentation (Improvements or additions to documentation)
ready (ONLY add when PR is ready to merge/full CI is needed)
rocm (Related to AMD ROCm)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Installation]: RIXL not available

4 participants