
feat: Support vLLM v0.15.1 or newer #94

Merged
walterbm-cohere merged 1 commit into main from fix/vllm-compatibility
Apr 14, 2026

Conversation

Contributor

@shun-cohere shun-cohere commented Apr 14, 2026

Description

This PR adds support for vLLM v0.15.1 and newer versions.
To do this, we introduce conditional import logic at the top of parser.py.
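The version gate can be sketched as follows. This is a minimal sketch, not the actual implementation: the gating threshold (`> 0.14.1`) and the `ty: ignore[unresolved-import]` suppressions mirror the commit description, but the helper names and the gated module paths here are hypothetical placeholders; the real paths are in the diff.

```python
def _parse_version(v: str) -> tuple[int, ...]:
    # Naive "X.Y.Z" parse; pre-release suffixes like "0.15.1rc1" would
    # need packaging.version in real code.
    return tuple(int(part) for part in v.split(".")[:3])

def uses_new_layout(vllm_version: str) -> bool:
    """True if this vLLM has the reorganized OpenAI protocol module
    layout introduced after 0.14.1 (vllm-project/vllm#32240)."""
    return _parse_version(vllm_version) > (0, 14, 1)

# At the top of parser.py the imports can then branch. Static checkers
# resolve both branches, so the branch whose module does not exist in
# the installed vLLM needs a suppression, e.g.:
#
# if uses_new_layout(vllm.__version__):
#     from <new protocol path> import DeltaMessage  # ty: ignore[unresolved-import]
# else:
#     from <old protocol path> import DeltaMessage  # ty: ignore[unresolved-import]
```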

Related Issue

vllm-project/vllm#32240 introduces a new module structure, which is a breaking change for melody.

Motivation and Context

Melody does not currently support vLLM v0.15 or newer versions.

How Has This Been Tested?

Check 1: Confirm that the import works with both vLLM versions

$ uv pip list | grep vllm
vllm                              0.15.1
$ uv run cohere_melody_vllm/parser.py
# no error
$ uv pip list | grep vllm
vllm                              0.14.1
$ uv run cohere_melody_vllm/parser.py
# no error

Check 2: Confirm that tool calling works with vLLM v0.15.1

Start server

uv run vllm serve CohereLabs/c4ai-command-r7b-12-2024 \
  --reasoning-parser cohere2 \
  --reasoning-parser-plugin ./cohere_melody_vllm/parser.py \
  --tool-parser-plugin ./cohere_melody_vllm/parser.py \
  --tool-call-parser cohere2 \
  --enable-auto-tool-choice

and then send a tool calling query

$ uv run tool.py
ChatCompletion(id='chatcmpl-a8a2b2e52a4dc558', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='chatcmpl-tool-a9f3c1ba457d277f', function=Function(arguments='{"location": "San Francisco, California", "unit": "celsius"}', name='get_weather'), type='function')], reasoning='I will use the get_weather tool to find out the weather in San Francisco, California in Celsius.', reasoning_content='I will use the get_weather tool to find out the weather in San Francisco, California in Celsius.'), stop_reason=None, token_ids=None)], created=1776133545, model='CohereLabs/c4ai-command-r7b-12-2024', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=69, prompt_tokens=1302, total_tokens=1371, completion_tokens_details=None, prompt_tokens_details=None), prompt_logprobs=None, prompt_token_ids=None, kv_transfer_params=None)
'Function called: get_weather'
'Arguments: {"location": "San Francisco, California", "unit": "celsius"}'
'Result: Getting the weather for San Francisco, California in celsius...'
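For reference, a script along these lines could produce the request above. This is a hypothetical reconstruction of tool.py, not the actual test script; the tool schema is inferred from the `get_weather` arguments shown in the output, and the server URL assumes vLLM's default port.

```python
def get_weather_tool() -> dict:
    """OpenAI-style tool schema matching the get_weather call above."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }

# Against the running `vllm serve` instance above (requires the openai
# package; api_key is a dummy value for a local server):
#
# from openai import OpenAI
# client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
# resp = client.chat.completions.create(
#     model="CohereLabs/c4ai-command-r7b-12-2024",
#     messages=[{"role": "user", "content": "Weather in San Francisco in celsius?"}],
#     tools=[get_weather_tool()],
# )
```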

Note

Medium Risk
Adds version-dependent imports for vLLM OpenAI protocol types, so a mistake in version detection or module paths could cause runtime import failures across supported vLLM versions.

Overview
Adds vLLM version-aware import logic in cohere_melody_vllm/parser.py to handle the OpenAI entrypoint protocol module reorganization introduced after vLLM 0.14.1, enabling the plugin to run against both old and new layouts.

Updates the Python bindings CI py-check job to run ty check in a matrix against vLLM 0.14.1 and 0.15.1 to continuously validate compatibility.

Reviewed by Cursor Bugbot for commit e3aa4f8.

@shun-cohere shun-cohere changed the title from "Support vLLM v0.15.1 or newer" to "Fix: Support vLLM v0.15.1 or newer" on Apr 14, 2026
@shun-cohere shun-cohere changed the title from "Fix: Support vLLM v0.15.1 or newer" to "feat: Support vLLM v0.15.1 or newer" on Apr 14, 2026

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 9e65a48.

- Branch OpenAI entrypoint imports on vllm version: vllm > 0.14.1 uses
  the reorganized paths introduced in vllm-project/vllm#32240
- Add ty: ignore[unresolved-import] suppressions for version-gated
  imports that may not exist in the installed vllm
- Matrix py-check CI job across vllm 0.14.1 and 0.15.1
- Fix is_reasoning_end signature: list[int] -> Sequence[int] to match
  the abstract base class
@shun-cohere shun-cohere force-pushed the fix/vllm-compatibility branch from d3b7eed to e3aa4f8 on April 14, 2026 04:45
      return content_ids

-     def is_reasoning_end(self, input_ids: list[int]) -> bool:
+     def is_reasoning_end(self, input_ids: Sequence[int]) -> bool:
Contributor Author

This is needed to make ty happy; otherwise the following error occurs:

error[invalid-method-override]: Invalid override of method `is_reasoning_end`
   --> cohere_melody_vllm/parser.py:144:9
    |
142 |         return content_ids
143 |
144 |     def is_reasoning_end(self, input_ids: list[int]) -> bool:
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Definition is incompatible with `ReasoningParser.is_reasoning_end`
145 |         end_token_id = self.model_tokenizer.convert_tokens_to_ids("<|END_THINKING|>")
146 |         return any(input_id == end_token_id for input_id in reversed(input_ids))
    |
   ::: .venv/lib/python3.12/site-packages/vllm/reasoning/abs_reasoning_parsers.py:54:9
    |
 53 |     @abstractmethod
 54 |     def is_reasoning_end(self, input_ids: Sequence[int]) -> bool:
    |         -------------------------------------------------------- `ReasoningParser.is_reasoning_end` defined here
 55 |         """
 56 |         Check if the reasoning content ends in the input_ids.
    |
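The fix follows standard override variance rules: an overriding method's parameter types must be at least as wide as the base class's, and `list[int]` is strictly narrower than `Sequence[int]`. A minimal sketch of the corrected signature (the class name and the `END_THINKING_ID` token id are hypothetical stand-ins, not the real values):

```python
from collections.abc import Sequence

class Cohere2ReasoningParserSketch:
    """Simplified stand-in for the real reasoning parser class."""

    END_THINKING_ID = 255022  # hypothetical id for <|END_THINKING|>

    def is_reasoning_end(self, input_ids: Sequence[int]) -> bool:
        # Sequence[int] matches the abstract base class signature, so the
        # override also accepts tuples and other sequences, not just lists.
        return any(tid == self.END_THINKING_ID for tid in reversed(input_ids))
```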

@walterbm-cohere walterbm-cohere merged commit b9560ae into main Apr 14, 2026
14 checks passed
@walterbm-cohere walterbm-cohere deleted the fix/vllm-compatibility branch April 14, 2026 13:49
