
Update vLLM version support to include 0.14.0 and 0.14.1#5214

Merged
qgallouedec merged 2 commits into main from vllm-0.14 on Mar 4, 2026

Conversation

@qgallouedec (Member)

Summary

Extend TRL’s vLLM support to 0.14.0 and 0.14.1.

Changes

vLLM 0.14.0 introduced a breaking change: data parallelism (DP) for dense models now raises an error (see vllm-project/vllm#30739).

Reproducer and traceback
$ trl vllm-serve --model Qwen/Qwen2.5-1.5B --data_parallel_size 2
INFO:     Started server process [859382]
INFO:     Waiting for application startup.
Process Process-1:
Traceback (most recent call last):
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/multiprocessing/process.py", line 313, in _bootstrap
    self.run()
    ~~~~~~~~^^
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/qgallouedec/trl/trl/scripts/vllm_serve.py", line 352, in llm_worker
    llm = LLM(
        model=script_args.model,
    ...<15 lines>...
        logprobs_mode="processed_logprobs",
    )
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/site-packages/vllm/entrypoints/llm.py", line 338, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
                      ~~~~~~~~~~~~~~~~~~~~~~~~~~^
        engine_args=engine_args, usage_context=UsageContext.LLM_CLASS
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/site-packages/vllm/v1/engine/llm_engine.py", line 168, in from_engine_args
    vllm_config = engine_args.create_engine_config(usage_context)
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/site-packages/vllm/engine/arg_utils.py", line 1584, in create_engine_config
    parallel_config = ParallelConfig(
        pipeline_parallel_size=self.pipeline_parallel_size,
    ...<37 lines>...
        _api_process_rank=self._api_process_rank,
    )
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/site-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
    s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for ParallelConfig
  Value error, Offline data parallel mode is not supported/useful for dense models. [type=value_error, input_value=ArgsKwargs((), {'pipeline...'_api_process_rank': 0}), input_type=ArgsKwargs]
    For further information visit https://errors.pydantic.dev/2.12/v/value_error
Process Process-2: (fails with an identical traceback)
ERROR:    Traceback (most recent call last):
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/site-packages/starlette/routing.py", line 694, in lifespan
    async with self.lifespan_context(app) as maybe_state:
               ~~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/contextlib.py", line 214, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/qgallouedec/trl/trl/scripts/vllm_serve.py", line 451, in lifespan
    msg = connection.recv()
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/multiprocessing/connection.py", line 430, in _recv_bytes
    buf = self._recv(4)
  File "/fsx/qgallouedec/miniconda3/envs/trl/lib/python3.13/multiprocessing/connection.py", line 399, in _recv
    raise EOFError
EOFError

ERROR:    Application startup failed. Exiting.
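On vLLM >= 0.14, the multi-GPU path for dense models is tensor parallelism rather than DP. Assuming `--tensor_parallel_size` is the corresponding `trl vllm-serve` flag (a hedged sketch mirroring the reproducer above, not a command taken from this PR):

```shell
# Fails on vLLM >= 0.14.0 for dense models:
#   trl vllm-serve --model Qwen/Qwen2.5-1.5B --data_parallel_size 2
# Shard the model across the same two GPUs with tensor parallelism instead:
trl vllm-serve --model Qwen/Qwen2.5-1.5B --tensor_parallel_size 2
```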

As I understand it, the vLLM team now considers DP scaling for dense models always detrimental to performance, which is surprising given my earlier benchmark. In any case, I recommend aligning with vLLM's recommendation and discouraging DP scaling for dense models even where it is still possible (vllm < 0.14).
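That recommendation could be enforced on the TRL side with a small version gate, so the failure surfaces as a clear message instead of a pydantic ValidationError deep inside the worker processes. A minimal sketch, not code from this PR; the helper name `dp_allowed` and the hard 0.14 cutoff are my own assumptions:

```python
def dp_allowed(vllm_version: str, data_parallel_size: int) -> bool:
    """Return False when the requested setup would be rejected:
    vLLM >= 0.14.0 errors out on DP > 1 for dense models.
    """
    if data_parallel_size <= 1:
        return True  # no data parallelism requested, always fine
    # Compare only the (major, minor) components of the version string.
    major, minor = (int(part) for part in vllm_version.split(".")[:2])
    return (major, minor) < (0, 14)
```

A caller could turn a False result into an early ValueError suggesting tensor parallelism instead, or merely emit a warning on older vLLM versions where DP still works but is discouraged.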

Tests

$ pytest tests/test_vllm_client_server.py
========================================== test session starts ==========================================
platform linux -- Python 3.13.11, pytest-9.0.2, pluggy-1.6.0
rootdir: /fsx/qgallouedec/trl
configfile: pyproject.toml
plugins: rerunfailures-15.1, anyio-4.12.1, xdist-3.8.0, datadir-1.8.0, cov-7.0.0
collected 37 items                                                                                      

tests/test_vllm_client_server.py ...............x............ssssss...                            [100%]

========================= 30 passed, 6 skipped, 1 xfailed in 425.60s (0:07:05) ==========================

@qgallouedec changed the title from "vllm 0.14" to "Update vLLM version support to include 0.14.0 and 0.14.1" on Mar 3, 2026
Comment on lines +539 to +558
def test_generate_with_params(self):
    prompts = ["Hello, AI!", "Tell me a joke"]
    completion_ids = self.client.generate(prompts, n=2, repetition_penalty=0.9, temperature=0.8, max_tokens=32)[
        "completion_ids"
    ]

    # Check that the output is a list
    assert isinstance(completion_ids, list)

    # Check that the number of generated sequences is 2 times the number of prompts
    assert len(completion_ids) == 2 * len(prompts)

    # Check that the generated sequences are lists of integers
    for seq in completion_ids:
        assert all(isinstance(tok, int) for tok in seq)

    # Check that the length of the generated sequences is less than or equal to 32
    for seq in completion_ids:
        assert len(seq) <= 32

qgallouedec (Member, Author):

Not specific to vLLM 0.14, but I realized this test case was missing.

@chatgpt-codex-connector (bot) left a comment:

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 59252637ef


Comment on lines +668 to +687
(the same test_generate_with_params as in the previous comment, added to a second test class)

@albertvillanova (Member) left a comment:

Thanks.

@qgallouedec merged commit 8f635b6 into main on Mar 4, 2026
13 checks passed
@qgallouedec deleted the vllm-0.14 branch on March 4, 2026 at 21:20